
Untitled

a guest
Jan 18th, 2018
  1. Big data (the five Vs)
  2. Volume
  3. Velocity
  4. Value
  5. Variety
  6. Veracity
  7.  
  8. Structured, semi-structured and unstructured data
  9. Organized (RDBMS); partially organized, no formal structure (XML, JSON); and unorganized, no data schema (multimedia files)
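
The three categories above can be illustrated with a small sketch. The table, records and field names are hypothetical, chosen only for illustration.

```python
import json
import sqlite3

# Structured: a fixed schema is declared before any row is written (RDBMS).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))
structured_row = conn.execute("SELECT id, name FROM users").fetchone()

# Semi-structured: self-describing keys, but no enforced schema (JSON, XML).
semi_structured = json.loads('{"id": 2, "name": "bob", "tags": ["a", "b"]}')

# Unstructured: opaque bytes with no data schema (e.g. a multimedia file).
# These four bytes are the usual start of a JPEG/JFIF file.
unstructured = bytes([0xFF, 0xD8, 0xFF, 0xE0])
```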
  10.  
  11. Difference between Hadoop and RDBMS
  12. RDBMS relies on structured data, provides limited or no processing capability, reads fast because the schema is already known, and is suitable for online transaction processing (OLTP)
  13. Hadoop can store any kind of data, processes data in a distributed, parallel fashion, writes fast, and is suitable for online analytical processing (OLAP)
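
One way to picture the read/write trade-off above is to compare validating at write time with interpreting at read time. This is a minimal sketch under stated assumptions: the `SCHEMA` dictionary, the function names and the in-memory lists are hypothetical stand-ins, not Hadoop or RDBMS APIs.

```python
import json

# Hypothetical two-column schema, used only for this illustration.
SCHEMA = {"id": int, "amount": float}

def write_with_schema(table, record):
    """RDBMS style: validate against the known schema at write time."""
    for field, ftype in SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"bad or missing field: {field}")
    table.append(record)

def write_raw(store, line):
    """Hadoop style: append the raw line as-is, with no validation on write."""
    store.append(line)

def read_with_schema_on_read(store):
    """Interpret raw lines only when reading; skip lines that do not parse."""
    out = []
    for line in store:
        try:
            rec = json.loads(line)
            out.append({"id": int(rec["id"]), "amount": float(rec["amount"])})
        except (ValueError, KeyError, TypeError):
            continue
    return out
```

In this sketch the write path of the raw store does no work per record, while the read path pays the parsing cost. The validated table is the opposite: writes are checked, and reads can trust the shape of every row.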
  14.  
  15. Components of hadoop?
  16. HDFS (storage) / YARN (Yet Another Resource Negotiator; resource management)
  17. NameNode / ResourceManager (master daemons)
  18. DataNode / NodeManager (slave daemons)
  19.  
  20. Main hadoop configuration files
  21. hadoop-env.sh
  22. core-site.xml
  23. hdfs-site.xml
  24. yarn-site.xml
  25. mapred-site.xml
  26. masters
  27. slaves
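
The `*-site.xml` files listed above share one layout: a `<configuration>` root holding `<property>` entries with `<name>` and `<value>` children. The sketch below reads such a file with Python's standard library. The sample content is hypothetical, and the host and port in `fs.defaultFS` are placeholders rather than values from any real cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content; host and port are placeholders.
CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""

def load_properties(xml_text):
    """Return the name -> value pairs of a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}

properties = load_properties(CORE_SITE)
```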
  28.  
  29.  
  30. What is the problem with having lots of small files? How can it be solved?
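
A commonly given answer is that the NameNode keeps metadata for every file and block in memory, so millions of small files consume NameNode heap far out of proportion to the data they hold. Usual mitigations are packing small files into larger containers such as HAR archives or SequenceFiles, or combining inputs at job time with CombineFileInputFormat. The sketch below is a rough estimate only: the 150 bytes per namespace object is a widely quoted rule of thumb and the 128 MB block size is an assumed setting, neither taken from these notes.

```python
import math

BYTES_PER_OBJECT = 150            # assumed rule of thumb, not an exact figure
BLOCK_SIZE = 128 * 1024 * 1024    # assumed HDFS block size of 128 MB

def namenode_bytes(num_files, file_size):
    """Rough NameNode memory estimate: one object per file plus one per block."""
    blocks_per_file = max(1, math.ceil(file_size / BLOCK_SIZE))
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT

# The same total volume stored two ways: 10 million 1 MB files,
# or 78,125 files of 128 MB each.
small_files = namenode_bytes(10_000_000, 1024 * 1024)
packed_files = namenode_bytes(78_125, BLOCK_SIZE)
```

Under these assumptions the small-file layout needs about 3 GB of NameNode memory and the packed layout about 23 MB, which is why merging small files into block-sized containers helps.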
  31.  
  32.  
  33.  
  34.  
  35. I have always enjoyed working with Java-related technologies. I have plenty of Java development and architecture experience, including three Java certifications
  36. Quality is the core of everything I do