Data Engineer (Research)

Job Title: Data Engineer


Job Type: Full-time, Contractor


Location: Remote


Job Summary

We are seeking a skilled Data Engineer with strong analytical thinking and a passion for solving research-driven data challenges. The role involves building data pipelines, performing exploratory data analysis (EDA), and working with both structured and unstructured data. Exposure to AI/ML techniques is an advantage, as you will collaborate with data scientists and researchers to derive insights and support model development.


Key Responsibilities

- Design, develop, and maintain robust and scalable ETL pipelines for ingesting and transforming raw data from diverse sources.

- Conduct exploratory data analysis (EDA) to identify patterns, anomalies, and valuable insights.

- Collaborate with researchers and data scientists to prepare datasets for AI/ML modeling and experimentation.

- Develop and manage data models, schemas, and databases for efficient storage and querying of large datasets.

- Write optimized SQL queries and scripts for data extraction and aggregation.

- Ensure data quality, integrity, and security across all pipelines and storage systems.

- Automate data validation and reporting workflows to facilitate ongoing research tasks.


Required Skills and Qualifications

- Proficiency in Python and SQL.

- Experience in building and managing ETL pipelines.

- Expertise with pandas and NumPy for data manipulation.

- Familiarity with AI/ML techniques and tools such as scikit-learn, Hugging Face Transformers, and the OpenAI API.

- Strong written and verbal communication skills.

- Experience with relational databases such as PostgreSQL and MySQL.

- Familiarity with tools used for exploratory data analysis, such as Jupyter Notebooks, VS Code, or PyCharm.


Preferred Qualifications

- Experience with vector search tools like Qdrant, FAISS, or Pinecone.

- Familiarity with data visualization tools such as Matplotlib, Seaborn, or Plotly.