Question 1:
What are the four basic data flow steps of an ETL process?
Question 2:
At which stage of the ETL process should data be profiled?
Question 3:
A. Sqoop
B. Bigtop
C. Autoconf
D. Oracle
Question 4:
What steps do you take to determine the bottleneck of a slow-running ETL process?
Question 5:
Why do dates require special treatment during the ETL process?
Question 6:
When should data be written to disk for safekeeping during the ETL process?
Question 7:
What are the essential deliverables of the data quality portion of ETL?
Question 8:
What is a logical data mapping and what does it mean to the ETL team?
Question 9:
Name the three fundamental fact grains and describe an ETL approach for each.
Question 10:
Describe how to estimate the load time of a large ETL job.