Data Scientist
Job Summary
The role holder is responsible for applying data mining techniques, performing
statistical analysis, building high-quality prediction algorithms, developing
analytical reports, and devising analytical solutions to use cases and data
science problems. This involves creating sophisticated, value-added analytic
systems that support revenue generation, risk management, operational
efficiency, regulatory compliance, portfolio management, and research.
Job Description
Key accountabilities/Deliverables/Outcomes.
- Perform statistical analysis, deploying models on large data sets.
- Conduct exploratory data analysis.
- Demonstrate a strong understanding of agile delivery.
- Develop code with Spark via PySpark or SparkR.
- Perform queries, aggregations, joins, and transformations using Spark, Hive,
and Pig.
- Develop new data sets using feature-engineering techniques.
- Deliver value by creating functions, classes, and packages to automate
processes and workflows for production deployment.
- Evaluate user requests for new or modified programs to determine feasibility,
cost and time required, compatibility with the current system, and computer
capabilities.
- Transform large, complex datasets into pragmatic, actionable insights.
- Leverage data to identify, quantify and influence tangible business gain.
- Implement analytical model designs, perform any restructuring required, and
review dataset implementations performed by data engineers and BI developers.
- Select features and build and optimize classifiers using machine learning
techniques.
- Perform data mining using bank-selected data mining tools.
- Enhance data collection procedures to include information that is relevant
for building analytic systems.
- Process, cleanse, and verify the integrity of data used for advanced
analysis.
- Perform ad-hoc analysis and present results in reports, dashboards and
charts.
- Create automated anomaly detection systems and continuously track their
performance.
- Implement statistical data quality procedures or a test-driven approach for
quality assurance.
- Challenge ideas and methods while working together with talented, highly
skilled team members.
- Design, create, interpret and manage large datasets to achieve business
goals.
- Design, build, and maintain various parts of the data warehouse, covering
requirements gathering, data modelling, metric establishment, report
production, and data visualization.
- Gather and process raw, unstructured data at scale into a form suitable for
analysis, then consolidate it into the data warehouse to support Business
Intelligence and advanced analytics.
- Evaluate datasets for accuracy and quality using statistical data quality
procedures, software, or test-driven approaches, and resolve any issues that
arise.
- Improve foundational data procedures, guidelines and standards, and develop
best practices for data management, maintenance, reporting and security.
- Conduct performance tuning to optimize the application of statistical models
and scripts.
- Develop and maintain documentation/manuals on models developed, reports
generated and statistical solutions devised.
- Carry out technical user training as required to enable users to interpret
data science solutions.
- Take personal responsibility and accountability for timely responses to
client queries, requests or needs, working to remove obstacles that may impede
execution or overall success.
- Assist in developing and implementing a program of continuous improvement of
data processes through a cycle of analyzing existing systems, processes, and
tools, identifying areas for improvement, implementing high-impact changes,
and gathering feedback from stakeholders.
- Understand Key Performance Measures and Indicators that drive company
performance measurement, reporting, and analytics across functions and
understand how these metrics and measures align and track against overall
business strategies, goals and objectives.
- Work with business customers to understand business requirements and
implement solutions, and with business owners to develop key business
questions and build datasets that answer those questions.
- Assist in analyzing business/use case requirements from BI analysts to
determine operational problems, define data modeling requirements, gather and
validate information, apply judgment and statistical tests, and develop data
structures to support the generation of business insights and strategy.
- Provide test interfaces for users to test reports and dashboards before they
are deployed to the production environment.
Role/person specification
Qualification
- Bachelor’s degree in mathematics, statistics, data science, or a related
quantitative field.
Preferred Experience
- 1-3 years of technical experience in data science
Knowledge and Skills
- Data-oriented personality
- Knowledge of agile software development processes and performance metric
tools
- Experience extracting and cleaning text in different formats, e.g. HTML and
PDF files
- Proven ability to collaborate with other team members across boundaries and
contribute productively to the team’s work and output, demonstrating respect
for different points of view. Able to use strong interpersonal and teamwork
skills to cultivate effective, productive client relationships and
partnerships across organizational boundaries.
- Knowledge of the Hadoop Data Platform and of using Scala for big data
analysis
- Proficient at writing queries, writing reports, and presenting findings
- Knowledge of ETL and data integration tools
- Knowledge of emerging technological trends in programming languages and
other programming tools