Job Title: Data Engineer
Job Summary:
We are looking for a Fresher Data Engineer to assist in building, maintaining, and optimizing data pipelines and data infrastructure for scalable analytics and business intelligence solutions. The ideal candidate should have a strong foundation in SQL, Python, data modeling, ETL processes, and cloud-based data solutions. This role involves working with large datasets, ensuring data integrity, and collaborating with data scientists, analysts, and software engineers to support data-driven decision-making.
Key Responsibilities:
- Assist in designing, developing, and maintaining scalable data pipelines for processing large datasets
- Work with ETL (Extract, Transform, Load) processes to ingest and transform structured and unstructured data
- Write and optimize SQL queries for data extraction, cleaning, and transformation
- Collaborate with data analysts and scientists to ensure efficient data workflows and availability
- Assist in building data models and schemas for analytical and operational use cases
- Learn and work with big data and orchestration technologies (Apache Spark, Hadoop, Airflow) for processing and scheduling workflows; a minimal pipeline sketch follows this list
- Work with cloud data platforms (AWS, Azure, Google Cloud) for data storage, processing, and retrieval
- Support data integrity, validation, and quality control processes
- Document data engineering workflows, schemas, and best practices
- Stay updated on modern data engineering tools, trends, and best practices
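To make the pipeline, ETL, and scheduling responsibilities above concrete, here is a minimal sketch of a daily ETL job written as an Airflow DAG. It assumes Apache Airflow 2.4+ (for the schedule argument) and pandas; the file paths, table name, and DAG id are hypothetical, and SQLite stands in for a real warehouse connection.

    # A minimal daily ETL DAG sketch (assumes Apache Airflow 2.4+ and pandas).
    # File paths, the table name, and the DAG id are hypothetical.
    from datetime import datetime

    import pandas as pd
    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_transform():
        # Extract: read raw order records from a (hypothetical) CSV drop.
        raw = pd.read_csv("/data/raw/orders.csv")
        # Transform: drop rows missing the key and normalize column names.
        clean = raw.dropna(subset=["order_id"]).rename(columns=str.lower)
        clean.to_csv("/data/staging/orders.csv", index=False)


    def load():
        # Load: append the staged rows into a warehouse table
        # (SQLite stands in for the real warehouse here).
        import sqlite3

        staged = pd.read_csv("/data/staging/orders.csv")
        with sqlite3.connect("/data/warehouse.db") as conn:
            staged.to_sql("fact_orders", conn, if_exists="append", index=False)


    with DAG(
        dag_id="orders_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # run once per day
        catchup=False,      # do not backfill missed runs
    ) as dag:
        extract_task = PythonOperator(task_id="extract_transform",
                                      python_callable=extract_transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        extract_task >> load_task  # load runs only after extract/transform succeeds

The >> operator declares the task dependency; in a real pipeline the load step would typically use an Airflow connection rather than a local SQLite file.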
Skills and Knowledge Required:
- Proficiency in SQL (MySQL, PostgreSQL, SQL Server)
- Basic knowledge of Python, Scala, or Java for data processing
- Understanding of database management and data modeling (normalization, indexing, partitioning)
- Familiarity with ETL tools (Apache NiFi, Talend, Informatica, dbt, SSIS)
- Exposure to cloud data solutions (AWS RDS, Google BigQuery, Azure Synapse)
- Understanding of data warehousing concepts (fact and dimension tables, star schema); see the star-schema sketch after this list
- Familiarity with big data frameworks such as Spark, Hadoop, or Hive (optional)
- Basic knowledge of workflow orchestration tools (Apache Airflow, Prefect, Luigi)
- Good analytical and problem-solving skills
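To illustrate the star-schema and indexing concepts above, here is a self-contained sketch that builds a tiny fact/dimension pair in SQLite and answers an analytical question with a join. All table and column names are invented for the example.

    # A tiny star-schema sketch: one fact table, one dimension table (SQLite).
    # All table and column names are invented for illustration.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Dimension table: descriptive attributes, one row per product.
    cur.execute("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT,
            category   TEXT
        )
    """)

    # Fact table: one row per sale, foreign key into the dimension.
    cur.execute("""
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES dim_product(product_id),
            sale_date  TEXT,
            amount     REAL
        )
    """)
    # An index on the join key speeds up fact-to-dimension lookups.
    cur.execute("CREATE INDEX idx_fact_sales_product ON fact_sales(product_id)")

    cur.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
    cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                    [(1, 1, "2024-01-05", 9.99),
                     (2, 1, "2024-01-06", 9.99),
                     (3, 2, "2024-01-06", 24.50)])

    # Typical analytical query: aggregate facts, grouped by a dimension attribute.
    for row in cur.execute("""
            SELECT p.category, p.name, SUM(f.amount) AS revenue
            FROM fact_sales f
            JOIN dim_product p ON p.product_id = f.product_id
            GROUP BY p.category, p.name
            ORDER BY revenue DESC
            """):
        print(row)

    conn.close()

The same shape scales up: fact tables stay narrow and numeric, while dimensions carry the descriptive attributes that queries group and filter by.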
Educational Qualifications:
- Bachelor’s degree in Computer Science, Data Science, Information Technology, or a related field
- Certifications in SQL, Cloud Data Engineering (AWS/GCP/Azure), or ETL Tools are a plus
Experience:
- 0-1 year of experience in data engineering, database management, or ETL processes
- Experience with academic projects, internships, or Kaggle competitions preferred
Key Focus Areas:
- Data Pipeline Development & Optimization
- ETL Processes & Data Integration
- Cloud-Based Data Engineering & Storage
- SQL Query Optimization & Data Modeling (see the query-plan sketch after this list)
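For the query-optimization focus area, the sketch below shows how adding an index changes a query plan, using SQLite's EXPLAIN QUERY PLAN as a stand-in for a production database's EXPLAIN. The table and data are hypothetical.

    # Query-optimization sketch: compare SQLite's plan before/after an index.
    # Table and column names are hypothetical; real warehouses have their own EXPLAIN.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE events (event_id INTEGER, user_id INTEGER, ts TEXT)")
    cur.executemany("INSERT INTO events VALUES (?, ?, ?)",
                    [(i, i % 100, "2024-01-01") for i in range(10_000)])

    query = "SELECT COUNT(*) FROM events WHERE user_id = 42"

    # Without an index, the planner scans the whole table.
    print(cur.execute("EXPLAIN QUERY PLAN " + query).fetchall())

    # An index on the filter column lets the planner seek instead of scan.
    cur.execute("CREATE INDEX idx_events_user ON events(user_id)")
    print(cur.execute("EXPLAIN QUERY PLAN " + query).fetchall())

    conn.close()

The first plan reports a full scan of events; after the index is created it reports a search using idx_events_user, which is the behavior query tuning usually aims for.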
Tools and Technologies:
- Programming Languages: SQL, Python, Scala (optional)
- Databases & Warehouses: MySQL, PostgreSQL, SQL Server, Oracle, Google BigQuery, Snowflake
- ETL & Data Pipeline Tools: Apache NiFi, Talend, Informatica, dbt, SSIS
- Big Data & Workflow Orchestration: Apache Spark, Hadoop, Hive, Airflow
- Cloud Platforms & Services: AWS RDS, Google Cloud SQL, Azure Data Factory
- Version Control & DevOps: Git, Terraform, Docker
Other Requirements:
- Willingness to learn modern data engineering technologies
- Ability to work in a team and collaborate with data scientists & analysts
- Strong problem-solving skills and attention to detail
- Good documentation and communication skills