Data engineer (senior) id30242

Campinas

Jobfinder Spain

Posted on March 8

Description

AgileEngine is one of the Inc. 5000 fastest-growing companies in the U.S. and a top-3 ranked dev shop according to Clutch.
We create award-winning custom software solutions that help companies across 15+ industries change the lives of millions.

If you like a challenging environment where you're working with the best and are encouraged to learn and experiment every day, there's no better place - guaranteed!

What you will do

Web Scraping & Data Extraction: design, develop, and optimize web scraping strategies for large-scale data extraction from dynamic websites; identify and assess relevant data sources, ensuring alignment with business objectives; implement automated web scraping solutions using Python and libraries like Scrapy, BeautifulSoup, and Selenium; build resilient and adaptable scrapers that can handle website structure changes, rate limits, and anti-scraping measures.

Data Processing & Integration: cleanse, validate, and transform extracted data to ensure accuracy, consistency, and usability; store and manage large volumes of scraped data using best-in-class storage solutions; develop ETL pipelines to integrate scraped data into data warehouses and analytics platforms; collaborate with cross-functional teams, including data scientists and engineers, to make scraped data actionable.

Web Scraping & Optimization: optimize scraping procedures to improve efficiency, reliability, and scalability across multiple data sources; implement solutions for bypassing CAPTCHAs, rotating user agents, and managing proxy services; continuously monitor, troubleshoot, and maintain scraping scripts to minimize disruptions due to site changes.

Compliance & Documentation: stay up to date with legal, ethical, and compliance considerations related to web scraping and data collection; ensure data collection processes align with best practices and regulatory requirements; maintain clear and detailed documentation of scraping methodologies, data pipelines, and best practices.

Must haves

5+ years of hands-on experience in web scraping, data extraction, and integration;
Strong proficiency in Python and web scraping frameworks (Scrapy, BeautifulSoup, Selenium);
Experience working with large-scale data storage solutions and optimizing retrieval performance;
Strong grasp of ETL processes, data pipelines, and data warehousing;
Familiarity with APIs for data extraction and integration from public and restricted sources;
Strong problem-solving skills with an ability to debug and adapt to changing web structures;
Solid understanding of web scraping ethics, legal implications, and compliance guidelines.

Nice to haves

Bachelor's degree in Computer Science, Data Science, Information Technology, or a related field;
Experience with cloud-based distributed scraping systems (AWS, GCP, Azure);
Knowledge of big data frameworks and experience handling high-volume datasets within Snowflake;
Familiarity with machine learning techniques for data extraction and natural language processing (NLP);
Experience working with JSON, XML, CSV, and other structured data formats;
Proficiency with version control systems (Git).

The benefits of joining us

Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.

Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.

A selection of exciting projects: Join projects with modern solutions development and top-tier clients that include Fortune 500 enterprises and leading product brands.

Flextime: Tailor your schedule for an optimal work-life balance, by having the options of working from home and going to the office – whatever makes you the happiest and most productive.

Next Steps After You Apply

The next steps of your journey will be shared via email within a few hours.
Please check your inbox regularly and watch for updates from our Internal Applicant site, LaunchPod, which will guide you through the process.
