Data Scientist
Posted 50 days ago
Job Description
As a Data Scientist at Protege, you will bridge healthcare data and customer needs, collaborating with customers to ensure the data aligns with their AI models in the healthcare vertical.
Responsibilities:
- Conduct feasibility analyses by querying healthcare datasets to assess patient cohort availability based on complex inclusion/exclusion criteria (e.g., procedures, diagnoses, diversity, longitudinal completeness, regulatory constraints)
- Collaborate directly with customers to understand their use cases and support effective data integration
- Ensure customers have a clear understanding of the data’s structure, limitations, and strengths
- Identify gaps in our data offerings and provide insights to our partnerships team on the highest-priority data acquisitions
- Evaluate potential data partnerships, ensuring the data is high-quality, well-documented, and commercially viable
Requirements:
- Bachelor's or master's degree (BS/MS) plus industry experience in a quantitative field such as mathematics, economics, statistics, biostatistics, bioinformatics, computer science, or data science
- Proficiency with programming in R/Python/SQL
- Hands-on experience working with large-scale healthcare datasets, including one or more of the following: imaging, EHR, genomics, claims, or pathology data
- Team player: no job is too big or too small
- You are an eager researcher, unafraid to learn or to face a knowledge gap
- You treat those around you with kindness
Bonus if you have these attributes:
- Experience in a customer-facing role
- Experience with data optimization techniques such as model-based filtering, multimodal data integration, heuristic filtering, and/or target distribution matching
- Experience applying machine learning or logistic regression techniques to healthcare data
- Familiarity with third-party data certification or audit processes related to privacy and data quality
- Ability to think creatively and insightfully about large-scale data problems