In this role, you will be responsible for building Hadoop-based systems that power the data pipeline behind multiple key features of our large-scale geospatial data. You’ll develop algorithms to match and conflate map features and to identify anomalies, and you’ll improve the simplicity, scalability, and efficiency of our systems so they can handle more sources and map features and keep our maps fresh.
Responsibilities
Identify requirements for new features and propose designs and solutions
Implement features in a suitable programming language
Take ownership of delivering features and improvements on time
Job Requirements
Ability and attitude to wear multiple hats and do whatever it takes
Strong object-oriented programming and design skills, preferably in Java/Scala
Experience building scalable, reliable, distributed Unix-based systems
Fluency in Hadoop-ecosystem technologies, e.g., YARN, Spark, Storm, Samza
Experience in information retrieval and storage or machine learning
Fluency with pipeline/workflow technologies such as Oozie or Azkaban
Excellent analytical and problem-solving skills
Excellent oral and written communication skills
Preferred Qualifications
Kafka
NoSQL Databases
Python
Experience with GIS, geographical data, and toolkits such as JTS, ArcGIS, QGIS, and OpenJUMP
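To give candidates a feel for the matching and conflation work described above, here is a minimal, dependency-free Java sketch of one toy heuristic: deciding whether two road segments from different sources describe the same real-world feature by comparing endpoint proximity. The data, class name, and tolerance are hypothetical illustrations, not our production logic; real systems would use robust geometry operations from a library such as JTS.

```java
// Toy conflation heuristic: two segments "match" if their endpoints
// (in either order) lie within a distance tolerance of each other.
public class ConflationSketch {
    // Planar Euclidean distance between two points.
    static double dist(double x1, double y1, double x2, double y2) {
        double dx = x2 - x1, dy = y2 - y1;
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Segments are given as {x1, y1, x2, y2}. Check both endpoint
    // orderings, since sources may digitize a road in opposite directions.
    static boolean segmentsMatch(double[] a, double[] b, double tol) {
        boolean sameOrder = dist(a[0], a[1], b[0], b[1]) <= tol
                         && dist(a[2], a[3], b[2], b[3]) <= tol;
        boolean reversed  = dist(a[0], a[1], b[2], b[3]) <= tol
                         && dist(a[2], a[3], b[0], b[1]) <= tol;
        return sameOrder || reversed;
    }

    public static void main(String[] args) {
        double[] source1 = {0, 0, 10, 10};         // road from provider A
        double[] source2 = {0.2, -0.1, 10.1, 9.8}; // same road, slight offset
        double[] other   = {50, 50, 60, 60};       // unrelated road
        System.out.println(segmentsMatch(source1, source2, 0.5)); // true
        System.out.println(segmentsMatch(source1, other, 0.5));   // false
    }
}
```

In practice, the interesting engineering is running logic like this at scale across billions of features in Spark, which is the heart of this role.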