Data & Automation Lead

Remote
Full Time
Mid Level

Our client is looking for a data and automation lead to help find, cleanse, ingest and
automate data used in the product and process. This person would use our
existing and new data sets to model and prototype how they could be used in our data model and
product, create and test ideas for cleaning the data, and work with our product and engineering
teams to automate its collection, cleaning and ingestion into the product. This person would also work with our
customer account team to understand what data our customers need and then find ways,
including scraping, LLMs and other tools, to find and aggregate that data. After prototyping
the collection and aggregation tool, the team would work to determine how to automate the
functionality. This role is responsible for making sure the data feeding our product's features is
clean, structured, and reliable.
---
Role Description

1. Help envision, design, prototype, build and test the data models for use in our client's product and process.
2. Work to find ways to collect, automate, structure and deliver our data across all aspects
of our client's product, including but not limited to the Market Landscapes, Documents
Database and the City/Company/Product page suite.
3. Work with the product and customer success teams to find efficiencies in collecting,
organizing and structuring our existing data set.
4. Work to align mapping of new data sets to our existing company, product, and
government data hierarchies.
5. Work with the customer success team to identify new sources and acquisition methods for
data based on customer needs.
6. Use scraping, AI, ML, LLMs and other emerging technologies to create proofs of concept,
refine them and then work to incorporate them into the product.
7. Operationalize disparate PO data sources: build Python ETL automation using
Pandas/regex/SQL to consolidate multi-year PO data, apply deduplication,
cross-reference column reseller mappings, and replace complex Excel workflows with scalable
pipelines.
8. Entity extraction and content ML scoring: implement NER, fuzzy matching, and
supervised models trained on labeled PO data to classify match confidence and
continuously improve accuracy.
9. Work with product and engineering teams to find the best and most efficient ways to add
the automation and data to our product.
10. Use and build technical skills to help manage data throughout its lifecycle, working with
the engineering team to apply them in the product.

Desired Skills
● Proficiency in data mining, wrangling, and cleaning of large-scale datasets
● Advanced Excel skills; combining them with Python (Pandas/NumPy) is a plus
● Experience maintaining and deploying ML models (Transformers, Pattern Recognition)
and handling model persistence (.pkl)
● Knowledge of or experience with APIs (OpenAI, Gemini, HubSpot, Slack)
● Experience with or interest in using AI, LLMs and other emerging technologies

Additional Job Details

This position is fully remote and open to candidates in Mexico and Latin America.
