Renato Pedroso Neto
Verified Expert in Engineering
Data Engineer and Developer
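Renato has more than 13 years of experience in big data projects. He has worked at Databricks, Capco, and financial institutions. Renato has migrated petabytes of data to on-premises and cloud data lake environments, architected entire lakehouses, implemented machine learning models that provide smart recommendations to clients, managed multicultural data teams, and delivered data projects for top-tier Brazilian banks. He has a master's degree in big data.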
Portfolio
Experience
Availability
Preferred Environment
Spark, Databricks, Python, Amazon Web Services (AWS), Google Cloud Platform (GCP), Machine Learning, Big Data, Amazon Elastic MapReduce (EMR), SQL, Amazon RDS
The most amazing...
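...project I've worked on is an open banking data ingestion project for Brazil that uses machine learning to ensure quality and gives financial institutions a reliable data source.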
Work Experience
Delivery Solutions Architect
Databricks
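- Increased customer usage 2x by analyzing customer data and proposing improvements.
- Stress-tested a Spark environment, generating and hashing 1 trillion rows in 28 minutes.
- Earned AWS Solutions Architect and Spark Developer certifications.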
Data Engineer
Comniscient Technologies LLC dba Comlinkdata
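- Developed new metrics for a telecom market data and insights platform, using Spark to help clients understand their customers' behavior.
- Helped build and evolve a product that examines the competitiveness of network carriers within a country.
- Implemented new DAGs in Airflow that transform telecom data using Spark.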
Data Engineer
An Online Freelance Agency
- Worked with a client to architect, construct, and support data pipelines from on-premises to cloud environments.
- Rearchitected the client's data pipeline in the cloud, reducing the total cost of ownership (TCO) by 40%.
- Consulted on Python code, including general guidance and best practices.
Lead Data Engineer | Architect | Scientist
Capco
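- Standardized data practices and released them as an official Capco offering.
- Owned all data projects for Capco's consulting and innovation labs.
- Led the development of open banking data ingestion and standardization, delivered directly to financial institutions.
- Created and tuned a natural language model for a financial institution.
- Developed market data pipelines for Capco's clients.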
Big Data Systems Engineer
Banco Itaú
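- Migrated 10 PB of data from the mainframe to a Hadoop environment, creating reliable data pipelines.
- Delivered 99.99% data availability in the HDFS environment.
- Built an information hub for the entire bank.
- Institutionalized parallel processing with Spark, delivering fast results to business areas.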
Experience
Open Banking Data Ingestion
Financial Data Web Scraping
Beacon Data Analysis
Monolith Decomposition
Sentiment Analysis for Financial Institutions
Mainframe to Big Data Environment Engineering
Education
Specialization in Data Science
Johns Hopkins University | via Coursera - Sao Paulo, Brazil
Master's Degree in Big Data
Faculdade de Informática e Administração Paulista (FIAP) - Sao Paulo, Brazil
Bachelor's Degree in Computer Science
Mackenzie University - Sao Paulo, Brazil
Certifications
Databricks Certified Machine Learning Professional
Databricks
Databricks Certified Data Engineer Professional
Databricks
Databricks Certified Associate Developer for Apache Spark 3.0
Databricks
AWS Certified Solutions Architect Associate
AWS
Machine Learning Engineer
Udacity
Data Science Specialization
Coursera
Getting and Cleaning Data
Coursera
Dell EMC Data Science Associate (EMCDSA)
Dell EMC
Linux Professional Institute 101 (LPIC-1)
Linux Professional Institute
Skills
Libraries/APIs
Spark Streaming, PySpark, Pandas, Scikit-learn, NumPy, Beautiful Soup, Selenium WebDriver
Tools
Git, Apache Airflow, Amazon Elastic MapReduce (EMR), Redash, BigQuery, Amazon Simple Queue Service (SQS), Amazon Transcribe, Amazon QuickSight, Amazon Athena, AWS Glue, Apache Maven
Frameworks
Spark, Apache Spark, Hadoop, Flask, Selenium, Scrapy
Paradigms
Data Science, ETL, Business Intelligence (BI), Logic Programming
Languages
Python, SQL, COBOL, XPath, Scala, Snowflake
Platforms
Databricks, Amazon Web Services (AWS), Linux, Amazon EC2, Google Cloud Platform (GCP), Apache Kafka
Storage
Databases, Apache Hive, Data Pipelines, Redshift, Data Lakes, Amazon S3 (AWS S3), NoSQL, MySQL, Google Cloud Datastore, PostgreSQL, MongoDB, Redis
Other
Machine Learning, Big Data, Data Engineering, Data Warehousing, Data, Data Analysis, Data Analytics, ELT, Systems Analysis, Cloud, Stream Processing, Scraping, Data Scraping, Web Scraping, Predictive Modeling, Amazon RDS, Operating Systems, IT Systems Architecture, Neural Networks, Statistics, Deep Learning, Data Modeling, Mainframe, Data Architecture, Prototyping, People Management, Client Relationship Management, Delta Lake, Google Cloud Functions, Pub/Sub, Vertex, Apache Superset, Clustering, Reporting, Natural Language Processing (NLP), APIs, Message Queues, GPT, Generative Pre-trained Transformers (GPT), Processing & Threading