I'm trying to offer constructive, if harsh, criticism based on my own experience, which includes recruiting for similar positions and working with large and small 501(c)(3)s.
I don't mean to come off as dismissive. My point is that your write-up is vague to the point of being easily dismissed, and I want to give feedback on how someone from outside your local peer group might read it.
And there are organizations out there with great IT and clean data, but I and most people in this field have lost months writing hideous combinations of NLP and regular expressions to pull data out of old medical records and the like, then hand-validating it or correcting for batch effects in supposedly clean data.
I think that fleshing out the projects and areas of investigation you guys already have lined up would go a long way towards addressing my concerns and making the program more appealing to the typical analytical folks I've worked with. I'd also suggest focusing the intensive course on analytical methods not the tools; this is what will intrigue people with expertise. At the moment it reads like it is aimed at people new to the field with no programming experience.
What data sets/types are you using for the Parkinson's thing? My main focus is on analysis methods that resist the noise, imbalance, heterogeneity and other issues typical in extremely wide/multivariate genetic+clinical+proteomic studies... a few sentences about the study in the write-up would have told me a lot about whether my skills could be useful. (I'm not looking to relocate, but I am always open to collaborations and correspondence with people working on similar things.)
As I said earlier -- I definitely appreciate the sentiment, and constructive criticism is always welcome when actually substantiated. I also took your post as an opportunity to elaborate a bit more on our model so my post got longer as a result.
> And there are organizations out there with great IT and clean data but (...)
This argument also works the other way round -- there are organizations out there with terrible data (and this is especially common with medical data), but there are also many high-impact projects that are begging to be solved, for which the data does exist in a workable form (and that we are actually working on solving). We are focusing on these in the short term, while laying the groundwork for the others in the medium-to-long term (both through the research arm we are building, and our data engineers). There is no reason not to get the low-hanging fruit first.
> I think that fleshing out the projects and areas of investigation you guys already have lined up (...)
Agreed. Since we created Bayes Impact two months ago our main focus has been on building the program from scratch and working on the projects as well, so the website has unfortunately taken a backseat. Another problem is that government organizations are very sensitive about communication and we can only communicate about our projects on their timeline. This results in us not having a website as fleshed out as we'd like, but this is par for the course for a new organization.
> I'd also suggest focusing the intensive course on analytical methods not the tools
Ah, I just saw the paragraph you're referring to. I get how the language may be a bit confusing and will make the appropriate changes -- our goal is actually the opposite: we bring on individuals who already have the analytical methods, though some may not have had exposure to industry best practices. Because we focus on building production systems and not just writing case studies, it's important to bring them up to speed in that minor respect. This is why we can spend only a week teaching tools -- teaching analytical methods to people without the required background would likely take much longer, and those people are not our target audience.
At a broad level we simply provide an avenue for data scientists to work on social impact problems in collaboration with domain experts, with us taking care of the overhead of scoping projects and doing the dirty work of acquiring and preparing the data as well as defining the implementation strategy. We also smooth out any rough edges in our Fellows' backgrounds, but this is really not the core of the program.
Fortunately, neither the pool of applicants nor our current fellows seem to echo your fears, but I'll review the fellowship page and see which changes could help remove ambiguities in the future.
Hope this helps clarify. Regarding the Parkinson's project, feel free to reach out to me by email -- unfortunately we need to wait for the press release from the MJFF and the other partner before I can actually communicate about the details publicly.
Seems like you've got big data problems to solve and data scientists up the wazoo.
I would think the missing element would include avant problem-solvers, regardless of (advanced) degrees or not, who are as outstanding in that specialty as the data scientists are in theirs.