- Big data: the five Vs
- Volume
- Velocity
- Value
- Variety
- Veracity
- Structured, semi-structured and unstructured data
- Structured: organized under a fixed schema (RDBMS). Semi-structured: partially organized, no formal structure (XML, JSON). Unstructured: unorganized, no data schema (multimedia files)
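The three categories above can be illustrated with a small sketch. The sample records and the `field_names` helper are made up for illustration; the point is only that structured rows all share one shape, semi-structured records may not, and unstructured data is just bytes.

```python
import csv
import io
import json

# Structured: fixed columns known up front, as in an RDBMS table.
structured = io.StringIO("id,name,age\n1,Ana,31\n2,Luis,45\n")
rows = list(csv.DictReader(structured))

# Semi-structured: keys travel with the data, but records may differ in shape.
semi_structured = [
    json.loads('{"id": 1, "name": "Ana", "tags": ["admin"]}'),
    json.loads('{"id": 2, "email": "luis@example.com"}'),
]

# Unstructured: opaque bytes with no data schema (stand-in for a media file).
unstructured = bytes([0xFF, 0xD8, 0xFF, 0xE0])


def field_names(records):
    """Return the distinct sets of field names seen across the records."""
    return {frozenset(r.keys()) for r in records}


print(len(field_names(rows)))             # one shape shared by every row
print(len(field_names(semi_structured)))  # shapes vary between records
```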
- Difference between Hadoop and RDBMS
- RDBMS: relies on structured data, provides limited or no processing, reads fast because the schema is already known, suitable for online transaction processing (OLTP)
- Hadoop: can store any kind of data, processes data in a distributed, parallel fashion, writes fast, suitable for online analytical processing (OLAP)
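The "reads fast because the schema is known" versus "writes fast" contrast is often described as schema-on-write versus schema-on-read. Below is a minimal sketch of that idea, using SQLite as a stand-in for an RDBMS and a plain list of text lines as a stand-in for files in Hadoop. The `orders` table, the sample lines and `parse_order` are hypothetical.

```python
import sqlite3

# Schema-on-write (RDBMS style): the table definition exists before any data
# arrives, and a row that violates it is rejected at insert time.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
db.execute("INSERT INTO orders VALUES (1, 19.99)")
try:
    db.execute("INSERT INTO orders VALUES (2, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Schema-on-read (Hadoop style): raw lines are stored as-is, which keeps
# writes cheap; structure is applied only when a job reads the data.
raw_lines = ["1,19.99", "2,", "3,5.00,extra-field"]


def parse_order(line):
    """Apply a schema at read time; return None for lines that do not fit."""
    parts = line.split(",")
    try:
        return int(parts[0]), float(parts[1])
    except (IndexError, ValueError):
        return None


parsed = [p for p in map(parse_order, raw_lines) if p is not None]
print(rejected, parsed)
```

The trade-off shown here is that the database pays the validation cost on every write, while the raw store defers it to every read.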
- Components of Hadoop?
- HDFS (storage) -> YARN (Yet Another Resource Negotiator, resource management)
- NameNode -> ResourceManager (master)
- DataNode -> NodeManager (slave)
- Main Hadoop configuration files
- hadoop-env.sh
- core-site.xml
- hdfs-site.xml
- yarn-site.xml
- mapred-site.xml
- masters
- slaves
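The `*-site.xml` files in the list above share one layout: a `<configuration>` root holding `<property>` entries, each with a `<name>` and a `<value>`. The sketch below renders that layout from a dictionary. `fs.defaultFS` and `dfs.replication` are real Hadoop property names, but the host name, port and replication value are placeholder assumptions, and `build_site_xml` is a hypothetical helper rather than part of Hadoop.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of settings in the <configuration>/<property> layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# core-site.xml: where clients find the default file system (placeholder host).
core_site = build_site_xml({"fs.defaultFS": "hdfs://namenode.example.com:9000"})
# hdfs-site.xml: HDFS-specific settings such as the block replication factor.
hdfs_site = build_site_xml({"dfs.replication": 3})

print(core_site)
print(hdfs_site)
```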
- What is the problem with having lots of small files? How can it be solved?
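The note leaves this question unanswered. The usual answer is that the NameNode keeps metadata for every file and block in memory, so many small files exhaust its heap long before the DataNodes run out of disk. Here is a rough sketch of the arithmetic, assuming the commonly quoted rule of thumb of about 150 bytes per metadata object (an assumption, not a measured figure) and the 128 MB default block size. `namenode_bytes` is a hypothetical helper.

```python
import math

BYTES_PER_OBJECT = 150       # rule-of-thumb assumption, not a measured value
BLOCK_SIZE = 128 * 1024**2   # default dfs.blocksize in Hadoop 2.x and later


def namenode_bytes(total_bytes, file_size):
    """Estimate NameNode heap for total_bytes stored as files of file_size."""
    files = math.ceil(total_bytes / file_size)
    blocks_per_file = max(1, math.ceil(file_size / BLOCK_SIZE))
    # One metadata object per file plus one per block of that file.
    return files * (1 + blocks_per_file) * BYTES_PER_OBJECT


TOTAL = 1024**4                            # 1 TiB of data
small = namenode_bytes(TOTAL, 1024**2)     # stored as 1 MiB files
large = namenode_bytes(TOTAL, BLOCK_SIZE)  # stored as 128 MiB files

print(small // large)  # how many times more heap the small files need
```

Commonly cited remedies all pack many small files into fewer large ones, for example Hadoop Archives (HAR), SequenceFiles, or reading with CombineFileInputFormat.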
- I have always enjoyed working with Java-related technologies. I have plenty of Java development and architecture experience, including three Java certifications
- Quality is the core of everything I do