Machine Learning Data Engineer

GetReal Labs

Full-time
Austin, TX, USA · Atlanta, GA, USA
135,000 – 165,000 USD per year
Posted on Jul 16, 2024

Company Overview:

GetReal Labs is the world’s leading authority on maliciously manipulated content and deepfake detection, offering advanced solutions to authenticate content and combat deception. Our technology serves corporations, financial institutions, media organizations, and government agencies. Incubated by Ballistic Ventures, the venture capital firm dedicated exclusively to funding and incubating entrepreneurs and innovations in cybersecurity, GetReal Labs was co-founded by Hany Farid, the go-to source for media forensics.

Role Overview:

As a Machine Learning Data Engineer at GetReal Labs, you will play a pivotal role in the development and implementation of data collection strategies for large-scale image, video, and audio datasets. Your primary focus will be on designing and executing efficient and scalable data collection pipelines, ensuring high-quality data for machine learning model training and validation. If you are passionate about leveraging cutting-edge technologies to gather diverse datasets and have a keen interest in machine learning, this is an exciting opportunity to make a significant impact.

Key Responsibilities:

  • Collaboration: Collaborate with cross-functional teams including data scientists, machine learning engineers, and domain experts to understand data requirements and objectives.
  • Data Strategy: Design and develop data collection strategies for large-scale image, video, and audio datasets, considering factors such as diversity, quality, and representativeness.
  • Pipeline Development: Design, develop, and maintain robust data pipelines to collect, store, and process large volumes of data efficiently and reliably.
  • Automation: Develop and implement automation tools to streamline data collection, processing, and curation tasks.
  • Infrastructure Management: Oversee the infrastructure required for data storage and processing, ensuring scalability and performance.
  • Data Curation: Curate and manage datasets, ensuring they are clean, well-organized, and suitable for training and testing ML models.
  • Data Quality Assurance: Implement data validation and quality assurance processes to ensure the integrity and accuracy of datasets.
  • Documentation: Document processes, methodologies, and best practices related to data collection and management.
  • Innovation: Stay up to date with the latest advancements in data collection, machine learning, and related fields, contributing insights and ideas to the team.

Qualifications:

  • Education: Bachelor’s or Master’s degree in Computer Science or a related field.
  • Experience: 3+ years of experience in designing and implementing data collection pipelines for image, video, or audio datasets.
  • Skills:
    • Strong programming skills in languages such as Python or Java.
    • Familiarity with machine learning concepts and frameworks (e.g., TensorFlow, PyTorch).
    • Experience with data preprocessing, cleaning, and transformation techniques.
    • Proficiency in using databases and data storage solutions (e.g., SQL, NoSQL, Hadoop).
    • Knowledge of cloud computing platforms (e.g., AWS, Azure, Google Cloud) and their services.
    • Excellent problem-solving skills and attention to detail.
    • Effective communication skills with the ability to work collaboratively in a team environment.
    • Experience with distributed computing and big data processing is a plus.
    • Background in computer vision, natural language processing, or audio processing is a plus.

Compensation and Benefits:

  • Competitive salary and stock options
  • Comprehensive health, dental, and vision insurance
  • Generous paid time off and company holidays
  • Professional development opportunities
  • Hybrid work environment, with 3 days per week expected in office in Atlanta, GA or Austin, TX

The salary range for this position is $135,000 to $165,000 depending on the candidate’s skills, experience, and qualifications. In addition to cash compensation, this role is eligible for a stock option grant.

GetReal Labs is an equal opportunity employer.

Apply for this job

To withdraw or update your application, email applications@getro.com