Job Details

Data Engineer - Data Science Platforms & Infrastructure

S&P Global
Richmond, Virginia, United States
Panjiva is a data-driven technology company that uses machine learning to provide powerful search, analysis, and visualization of billions of shipping records from nearly every country in the world. More than 3,000 customers in over 100 countries, ranging from Fortune 500 companies and startups to government agencies and hedge funds, rely on our platform for supply chain intelligence. In global trade, better insight means better decision making and stronger connections between companies and governments across the globe.

Recognizing Panjiva's cutting-edge technology, S&P Global acquired Panjiva in 2018. This acquisition has grown our resources, dramatically expanded our access to data, and accelerated our growth plans.

**People are Panjiva's greatest strength - join our engineering team as we map out a key part of the world economy!**

**Job Description**

As a data engineer on our team, you will play a key role in developing our next-generation data science infrastructure and underlying core technologies. You will work with Panjiva's world-class data scientists, analysts, and engineers to create products that solve important real-world business problems in a collaborative, fast-paced, and fun environment.

You'll work closely with our data science team to develop new platforms, infrastructure, and tools that will allow for machine learning applications at production scale over ever-growing datasets.

You'll design and leverage distributed computing technologies, data schemas, and APIs to construct data science pipelines.
In addition, you'll be expected to participate in augmenting our infrastructure to seamlessly integrate new data sets through constant R&D of the technologies and systems we use.

Join us in building the next generation of products as we continue to deliver valuable and actionable insights to decision-makers in the $15 trillion global trade industry.

**Responsibilities**

+ Architect and implement distributed systems that perform complex transformations, processing, and analysis over very large scale datasets
+ Develop processes to monitor and automate detection of quality regressions in raw data or in the output of Panjiva's machine learning models
+ Work with our data scientists to turn large-scale, messy, diverse, and often unstructured data into a source of meaningful insights for our customers
+ Optimize slow-running database queries and data pipelines
+ Help enhance our search engine, capable of running sophisticated user queries quickly and efficiently
+ Build internal tools and backend services to enable our data scientists and product engineers to improve efficiency

**Compensation/Benefits Information:**

S&P Global states that the anticipated base salary range for this position is $91,500 to $190,100. Base salary ranges may vary by geographic location.

This role is eligible to receive S&P Global benefits. For more information on the benefits we provide to our employees, visit _ _._

**Qualifications**

+ B.S., M.S., or Ph.D. in Computer Science (or a related field) or equivalent work experience
+ 3+ years of experience working with data-at-scale in a production environment
+ Experience designing and implementing large-scale, distributed systems
+ Experience in multi-threaded software development (or _some_ form of parallelism)
+ Significant performance engineering experience (e.g., profiling slow code, understanding complicated query plans, etc.)
+ Solid understanding of core algorithms and data structures, including the ability to select (and apply) the optimal ones to computationally expensive operations over data-at-scale
+ Strong understanding of relational databases and proficiency with SQL
+ Deep knowledge of at least one scripting language (e.g., Python, Ruby, JavaScript)
+ Deep knowledge of at least one compiled language (e.g., Scala, C++, Java, Go)
+ Experience developing software on Linux-based operating systems
+ Experience with distributed version control systems

**Nice-to-Haves**

+ Familiarity with relational database _internals_ (especially PostgreSQL)
+ Proficiency with cloud computing platforms, specifically AWS
+ Working knowledge of probability & statistics
+ Contributions to open-source software
+ Experience building customer-centric products

**Grade** ( _relevant for internal applicants only_ ): 11

S&P Global is an equal opportunity employer committed to making all employment decisions without regard to race/ethnicity, sex, pregnancy, gender identity or expression, color, creed, religion, national origin, age, disability, marital status (including domestic partnerships and civil unions), sexual orientation, military veteran status, unemployment status, or any other basis prohibited by federal, state or local law.
Only electronic job submissions will be considered for employment.

If you need an accommodation during the application process due to a disability, please send an email to: **** and your request will be forwarded to the appropriate person.

The EEO is the Law Poster describes discrimination protections under federal law.

20 - Professional (EEO-2 Job Categories-United States of America), IFTECH202.2 - Middle Professional Tier II (EEO Job Group), SWP Priority - Ratings - (Strategic Workforce Planning)

**Job ID:** 257140

**Posted On:** 2021-03-25

**Location:** Cambridge, Massachusetts, United States
