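Spark Big Data Technology Project Practice (New Generation Information Technology Textbook Series)
Synopsis
This book centers on Spark, a big data processing technology. It aims to guide readers through the full workflow of big data analysis and processing, and it examines the key technologies used at each stage and the principles behind them.
The book contains eight hands-on projects. Project 1 shows how to build a stable and efficient Spark cluster environment, discusses Spark's basic concepts, features, and application scenarios, and compares Spark with Hadoop. Project 2 implements a complete personnel management system; along the way it introduces the basic syntax of the Scala language and the concepts of object-oriented and functional programming, and demonstrates how to develop Spark applications in Scala. Projects 3 through 7 use Spark to analyze and process e-commerce user behavior data, movie data, bank customer data, equipment failure data, and social media comment data, covering the whole process from data preprocessing to advanced statistical analysis. Project 8 is a comprehensive case study, advertisement click-through rate prediction based on Spark MLlib, which ties together the material from the earlier projects and leads readers step by step through the core workflow of big data development, including data preprocessing, feature engineering, and model training and evaluation. The book combines theoretical material with a large number of practical cases, with the aim of helping readers master the practical application of Spark big data technology.
This book can serve as a textbook for computer-related majors at colleges and universities, and as a reference for technical professionals in the computing field and for programming enthusiasts.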
About the Authors
Editors: Deng Yongsheng (鄧永生) // Li Li (李麗) // Zhang Junhao (張俊豪) | Responsible editor: Gao Jing (高驚)
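Contents
Project 1 Building a Spark Cluster Environment
Task 1.1 Getting to Know Spark
1.1.1 Overview of Spark
1.1.2 Features of Spark
1.1.3 Application Scenarios of Spark
1.1.4 Spark Compared with Hadoop
Task 1.2 Building a Spark Cluster
1.2.1 Preparing for Installation
1.2.2 Spark Deployment Modes
1.2.3 Installing and Deploying a Spark Cluster
Task 1.3 Spark Runtime Architecture and Principles
1.3.1 Runtime Architecture of a Spark Cluster
1.3.2 Basic Principles of Spark Execution
Innovative Learning
Skills Test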
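Project 2 Implementing a Personnel Management System in Scala
Task 2.1 Setting Up the Scala Development Environment
2.1.1 Introduction to Scala
2.1.2 Setting Up the Scala Development Environment
2.1.3 Ways to Run Scala Code
Task 2.2 Learning Basic Scala Syntax
2.2.1 Basic Syntax and Structure
2.2.2 Data Types and Operations
2.2.3 Object-Oriented Programming
2.2.4 Functional Programming
2.2.5 Input, Output, and Exception Handling
2.2.6 Advanced Features
Task 2.3 Implementing the Personnel Management System
2.3.1 Requirements of the Personnel Management System
2.3.2 System Architecture and Technical Design
2.3.3 Implementing the Required Features
2.3.4 Compiling and Running
2.3.5 Code Optimization
Innovative Learning
Skills Test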
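Project 3 E-Commerce User Behavior Data Analysis
Task 3.1 Getting to Know RDDs
3.1.1 The Concept of an RDD
3.1.2 Features of RDDs
3.1.3 Categories of RDD Operations
Task 3.2 RDD Operations in Practice
3.2.1 Hands-On with the Spark Shell Environment
3.2.2 Ways to Create an RDD
3.2.3 Common Transformation Operations in Practice
3.2.4 Common Action Operations in Practice
Task 3.3 Implementing E-Commerce User Behavior Analysis with RDDs
3.3.1 Introduction to the E-Commerce User Behavior Data
3.3.2 Functional Requirements Analysis
3.3.3 Analysis of the Implementation Approach
3.3.4 Data Preprocessing
3.3.5 Implementing the Required Features
Innovative Learning
Skills Test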
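Project 4 Implementing Movie Data Analysis
Task 4.1 Setting Up the Spark Development Environment
4.1.1 Introduction to and Installation of IntelliJ IDEA
4.1.2 Installation and Basic Use of Zeppelin
Task 4.2 Writing Your First Spark Program
4.2.1 Introduction to the Programming Model
4.2.2 Analysis of the Spark WordCount Example
4.2.3 Implementing Spark WordCount in Code
Task 4.3 Packaging and Running a Spark Program
4.3.1 Introduction to Packaging Plugins
4.3.2 Packaging a Program in Practice
4.3.3 Submitting a Spark Program to Run on the Cluster
Task 4.4 Implementing Movie Data Analysis in Code
4.4.1 Project Background
4.4.2 Data Description
4.4.3 Functional Requirements
4.4.4 Implementing the Requirements
Innovative Learning
Skills Test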
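Project 5 Bank Customer Data Analysis
Task 5.1 Getting to Know Spark SQL
5.1.1 Overview of Spark SQL
5.1.2 Data Representation and Processing
5.1.3 SQL Queries and Optimization
Task 5.2 Spark SQL Basics
5.2.1 Basic DataFrame API Operations
5.2.2 Data Sources and Formats
Task 5.3 Advanced Spark SQL Operations
5.3.1 Advanced Operations and Features
5.3.2 Performance Optimization and Tuning
5.3.3 Extension and Integration
Task 5.4 Analyzing and Summarizing Bank Customer Data
5.4.1 Introduction to the Bank Customer Data
5.4.2 Data Preprocessing and Preparation
5.4.3 Data Exploration and Analysis
5.4.4 Customer Behavior Analysis
Innovative Learning
Skills Test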
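Project 6 Real-Time Monitoring of Equipment Failures
Task 6.1 Getting to Know Structured Streaming
6.1.1 Overview of Structured Stream Processing
6.1.2 Data Sources and Data Sinks
6.1.3 Real-Time Data Processing and Output
Task 6.2 Simulating Equipment Data
6.2.1 Equipment Data Generation Tool
6.2.2 Stream Processing of Equipment Data
Task 6.3 Implementing Real-Time Monitoring of Equipment Failures
6.3.1 Architecture of the Equipment Failure Monitoring System
6.3.2 Real-Time Monitoring and Processing of Equipment Failures
Innovative Learning
Skills Test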
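Project 7 Sentiment Analysis of Social Media Comments
Task 7.1 Understanding Spark MLlib
7.1.1 Overview of Spark MLlib
7.1.2 The Machine Learning Workflow
7.1.3 Real Estate Data Processing and Output
Task 7.2 Data Processing and Model Application
7.2.1 Data Collection and Preparation
7.2.2 Feature Engineering and Model Training
7.2.3 Model Evaluation and Deployment
Task 7.3 Performing Sentiment Analysis on Social Media Comment Data
7.3.1 Overview of the Social Media Comment Data
7.3.2 Data Collection and Preprocessing
7.3.3 Training and Evaluating the Sentiment Analysis Model
7.3.4 Presenting the Sentiment Analysis Results
Innovative Learning
Skills Test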
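Project 8 Advertisement Click-Through Rate Prediction Based on Spark MLlib
Task 8.1 Project Introduction
8.1.1 Project Background
8.1.2 Project Tasks
8.1.3 Project Implementation Workflow
Task 8.2 Preparing the Dataset
Task 8.3 Data Preprocessing
Task 8.4 Implementing Feature Engineering
Task 8.5 Model Training ...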