About the Role
Responsibilities
• Create and maintain optimal data pipeline architecture; assemble large, complex data sets that meet functional / non-functional requirements.
• Design the right schema to support functional requirements and consumption patterns.
• Design and build production data pipelines from ingestion to consumption.
• Build the data marts and data warehouses required for optimal extraction, transformation, and loading of data from a wide variety of data sources.
• Create the necessary preprocessing and postprocessing for various forms of data for training/retraining and inference ingestion as required.
• Create data visualization and business intelligence tools that give stakeholders and data scientists the business and solution insights they need.
• Identify, design, and implement internal process improvements: automating manual data processes, optimizing data delivery, etc.
• Ensure our data is separated and secure across national boundaries through multiple data centers and AWS regions.
Requirements and Skills
• You should have a bachelor's or master's degree in Computer Science, Information Technology, or another quantitative field.
• You should have at least 5 years of experience as a data engineer supporting large data transformation initiatives related to machine learning, including building and optimizing pipelines and data sets.
• Strong analytical skills related to working with unstructured datasets.
• Experience with Azure cloud services: ADF, Azure Synapse, Blob Storage, ADLS, App Insights, and familiarity with various log formats from Azure.
• Experience with object-oriented/functional scripting languages: Python, PySpark, Java, etc.
• Experience with big data tools: Hadoop, Spark, Kafka, etc.
• Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
• You should be a good team player, committed to the success of the team and the overall project.
About the Company
Cigres Technologies Private Limited is a technology consulting and services company that focuses on helping clients resolve their significant digital problems and enabling radical digital transformation using multiple technologies, on premises or in the cloud. The company was founded with the goal of leveraging cutting-edge technology to deliver innovative solutions to clients across various industries.