Content Extraction

a guest
Jul 15th, 2018
--------Extract_helper.php-----------
<?php
/////Created by @anu
////For edgeryders

// Base URL of the Project Gutenberg FTP mirror.
$base_url = 'ftp://sunsite.informatik.rwth-aachen.de/pub/mirror/ibiblio/gutenberg/';

$file = file_get_contents('Content.txt');
if ($file === false) {
    exit('Could not read Content.txt');
}

// Capture only the 4- or 5-digit numbers that follow "#", so that
// publication years such as 1918 are not mistaken for text numbers.
preg_match_all('/#(\d{4,5})\b/', $file, $matches);
$codes = $matches[1];

foreach ($codes as $code) {
    // The mirror nests each text under one directory per digit except the
    // last, e.g. 4509 -> 4/5/0/4509 and 12238 -> 1/2/2/3/12238.
    $dir = $base_url . implode('/', str_split(substr($code, 0, -1))) . '/' . $code;

    // Five-digit texts are tried with the "-8" file name suffix first.
    $text_url = $dir . '/' . $code . (strlen($code) == 5 ? '-8' : '') . '.txt';

    $query = http_build_query(['url' => $text_url, 'project_code' => $code]);
    echo '<a href="Format_content.php?' . htmlspecialchars($query) . '" target="_blank">'
        . htmlspecialchars($dir) . '</a><br>';
}
?>

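The directory layout that Extract_helper.php relies on can be isolated in a small function. The sketch below is not part of the original paste: the name `gutenberg_text_url` and its `$suffix` parameter are hypothetical, and the layout (one directory per digit except the last, then a directory named after the text number) is assumed from the URLs the script builds for the 4- and 5-digit numbers in Content.txt.

```php
<?php
// Hypothetical helper, not part of the original scripts.
// Assumes the mirror layout implied by Extract_helper.php and a text
// number of at least two digits (Content.txt only has 4 or 5 digits).
function gutenberg_text_url($base_url, $code, $suffix = '')
{
    $code = (string) $code;
    // "4509" -> ["4", "5", "0"]: every digit except the last is a directory.
    $digits = str_split(substr($code, 0, -1));
    return rtrim($base_url, '/') . '/' . implode('/', $digits)
        . '/' . $code . '/' . $code . $suffix . '.txt';
}

$base = 'ftp://sunsite.informatik.rwth-aachen.de/pub/mirror/ibiblio/gutenberg/';
echo gutenberg_text_url($base, '4509'), "\n";
// ftp://sunsite.informatik.rwth-aachen.de/pub/mirror/ibiblio/gutenberg/4/5/0/4509/4509.txt
echo gutenberg_text_url($base, '12238', '-8'), "\n";
// ftp://sunsite.informatik.rwth-aachen.de/pub/mirror/ibiblio/gutenberg/1/2/2/3/12238/12238-8.txt
```

Keeping the suffix as a parameter makes it easy to retry the same text number with '', '-8' or '-0', which are the three file name variants the scripts attempt.
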
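---------Format_content.php-------
<?php
error_reporting(0);
$url = isset($_GET['url']) ? $_GET['url'] : '';
$project_code = isset($_GET['project_code']) ? $_GET['project_code'] : '';
$error = isset($_GET['error']) ? $_GET['error'] : '';

// Only fetch from the Gutenberg mirror; any other value of "url" would let
// a visitor make this script read arbitrary URLs or local files.
$mirror = 'ftp://sunsite.informatik.rwth-aachen.de/pub/mirror/ibiblio/gutenberg/';
if (strpos($url, $mirror) !== 0) {
    exit('Unexpected url');
}

// Read only the first 1500 bytes, where "Title:" and "Release Date:" are looked up.
$get_content = file_get_contents($url, false, NULL, 0, 1500);

preg_match('/Title: (.*)/', $get_content, $title);
preg_match('/Release Date: (.*)/', $get_content, $release_date);

$safe_url = htmlspecialchars($url);
$next = '?project_code=' . urlencode($project_code) . '&amp;url=';

if (isset($title[1])) {
    echo '[' . htmlspecialchars(trim($title[1])) . '](<a href="' . $safe_url . '" target="_blank">' . $safe_url . '</a>) '
        . (isset($release_date[1]) ? htmlspecialchars(trim($release_date[1])) : '')
        . '. Project Gutenberg text no. ' . htmlspecialchars($project_code);
} else {
    echo '<h3>Errors?</h3>';
    // Strip any "-8" or "-0" suffix to get the plain file stem, e.g. .../12238/12238
    $stem = preg_replace('/(-[08])?\.txt$/', '', $url);
    if ($error === '') {
        echo '<a href="' . $next . urlencode($stem . '.txt') . '&amp;error=1">Try method 1</a>';
    } elseif ($error == '1') {
        echo '<a href="' . $next . urlencode($stem . '-0.txt') . '&amp;error=2">Try next method 2</a>';
    } else {
        echo 'Please follow the native url from previous tab and find the solution manually';
    }
}
?>
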
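The header lookup in Format_content.php can be tried without any network access. The sketch below is an illustration only: `parse_gutenberg_header` is a hypothetical name, the sample header text is made up, and it assumes the "Title:" and "Release Date:" labels start a line, which the original patterns do not require.

```php
<?php
// Hypothetical helper, not part of the original scripts.
// Returns the title and release date found in a text header, or null
// for a field that is missing.
function parse_gutenberg_header($text)
{
    $fields = ['title' => null, 'release_date' => null];
    // The "m" flag lets ^ and $ match at line boundaries; trim() drops
    // the "\r" that remains when the file uses CRLF line endings.
    if (preg_match('/^Title: (.*)$/m', $text, $m)) {
        $fields['title'] = trim($m[1]);
    }
    if (preg_match('/^Release Date: (.*)$/m', $text, $m)) {
        $fields['release_date'] = trim($m[1]);
    }
    return $fields;
}

// Made-up sample header, for illustration only.
$sample = "Title: Example Book\r\nAuthor: Someone\r\nRelease Date: January 1, 2000\r\n";
var_dump(parse_gutenberg_header($sample));
```

Returning null for a missing field gives the caller an explicit signal to fall back to another file name variant, rather than relying on suppressed notices.
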
-----------Content.txt------------
Three Acres and Liberty. 1918. Project Gutenberg text #4509. Instructions for "living off the land" with a small farm in the U.S.

The Farm That Won't Wear Out. Project Gutenberg text #4525. Instructions for nutrient management in farm soils.

Dry-Farming: A System of Agriculture for Countries under Low Rainfall. 1920. Project Gutenberg text #4924.

Farmers of Forty Centuries, or: Permanent Agriculture in China, Korea and Japan. 1911. Project Gutenberg text #5350.

Home Vegetable Gardening: A Complete And Practical Guide To The Planting And Care Of All Vegetables, Fruits And Berries Worth Growing For Home Use. Project Gutenberg text #7123.

The Country Housewife and Lady's Director In the Management of a House, and the Delights and Profits of a Farm. Project Gutenberg text #7123. Mostly a recipe book, but with some great food processing etc. tips. [TODO: There is a second part. To be found.]

Common Diseases of Farm Animals. 1915. Project Gutenberg text #8502. Seems to be quite in-depth. Obviously, more modern material would be better.

Domestic Cookery, Useful Receipts, and Hints to Young Housekeepers. Project Gutenberg text #9101. Mostly recipes, but among them some great stuff ("Pickled walnuts will keep for six or seven years").

Manual of Gardening: A Practical Guide to the Making of Home Grounds and the Growing of Flowers, Fruits, and Vegetables for Home Use. 1910. Project Gutenberg text #9550.

Directions for Cookery, in its Various Branches. 1840. Project Gutenberg text #9624.

Woman's Institute Library of Cookery, Volume 1: Essentials of Cookery; Cereals; Bread; Hot Breads. Project Gutenberg text #9935.

Woman's Institute Library of Cookery, Volume 2: Milk, Butter and Cheese; Eggs; Vegetables. Project Gutenberg text #9936.

Woman's Institute Library of Cookery, Volume 3: Soup; Meat; Poultry and Game; Fish and Shell Fish. Project Gutenberg text #9937.

Woman's Institute Library of Cookery, Volume 4: Salads and Sandwiches; Cold and Frozen Desserts; Cakes, Cookies and Puddings; Pastries and Pies. Project Gutenberg text #9938.

Woman's Institute Library of Cookery, Volume 5: Fruit and Fruit Desserts; Canning and Drying; Jelly Making, Preserving and Pickling; Confections; Beverages; The Planning of Meals. Project Gutenberg text #9939.

English Housewifery Exemplified In above Four Hundred and Fifty Receipts Giving Directions for most Parts of Cookery. 1764. Project Gutenberg text #10072.

Reform Cookery Book: Up-To-Date Health Cookery for the Twentieth Century. Project Gutenberg text #11067.

Our Farm of Four Acres and the Money we Made by it. 1860. Project Gutenberg text #11555.

Science in the Kitchen. 1893. Project Gutenberg text #12238.

Old Cookery Books and Ancient Cuisine. 1902. Project Gutenberg text #12293.

Project Gutenberg text #12815.

Project Gutenberg text #13347.

Project Gutenberg text #13510.

Project Gutenberg text #13537.

Project Gutenberg text #13545.

Project Gutenberg text #13887.

Project Gutenberg text #13923.

Project Gutenberg text #14594.

Project Gutenberg text #16232.

Project Gutenberg text #16900.

Project Gutenberg text #18050.

Project Gutenberg text #19775.

Project Gutenberg text #19998.

Project Gutenberg text #20770.

Project Gutenberg text #21682.

Project Gutenberg text #21724.

Project Gutenberg text #22114.

Project Gutenberg text #22484.

Project Gutenberg text #22790.

Project Gutenberg text #22829.

Project Gutenberg text #23435.

Project Gutenberg text #24076.

Project Gutenberg text #26132.

Project Gutenberg text #26313.

Project Gutenberg text #26374.

Project Gutenberg text #26718.

Project Gutenberg text #26801.

Project Gutenberg text #27257.

Project Gutenberg text #28452.

Project Gutenberg text #28500.

Project Gutenberg text #29058.

Project Gutenberg text #29084.

Project Gutenberg text #30441.

Project Gutenberg text #30983.

Project Gutenberg text #31105.

Project Gutenberg text #31423.

Project Gutenberg text #31605.

Project Gutenberg text #31643.

Project Gutenberg text #31729.

Project Gutenberg text #32158.

Project Gutenberg text #32818.

Project Gutenberg text #33830.

Project Gutenberg text #33974.

Project Gutenberg text #34175.

Project Gutenberg text #34437.

Project Gutenberg text #35567.

Project Gutenberg text #35646.

Project Gutenberg text #36064.

Project Gutenberg text #36689.

Project Gutenberg text #37389.

Project Gutenberg text #38615.

Project Gutenberg text #39791.

Project Gutenberg text #39869.

Project Gutenberg text #40190.

Project Gutenberg text #40943.

Project Gutenberg text #41352.

Project Gutenberg text #41406.

Project Gutenberg text #42544.

Project Gutenberg text #42718.

Project Gutenberg text #42955.

Project Gutenberg text #43177.

Project Gutenberg text #43468.

Project Gutenberg text #43531.

Project Gutenberg text #43867.

Project Gutenberg text #43867.

Project Gutenberg text #44603.

Project Gutenberg text #44732.

Project Gutenberg text #44750.

Project Gutenberg text #44766.

Project Gutenberg text #45004.

Project Gutenberg text #45083.

Project Gutenberg text #45154.

Project Gutenberg text #45331.

Project Gutenberg text #45703.

Project Gutenberg text #46052.

Project Gutenberg text #46144.

Project Gutenberg text #46377.

Project Gutenberg text #46445.

Project Gutenberg text #47444.

Project Gutenberg text #48134.

Project Gutenberg text #48360.

Project Gutenberg text #48378.

Project Gutenberg text #48546.

Project Gutenberg text #48547.

Project Gutenberg text #48722.

Project Gutenberg text #48753.

Project Gutenberg text #49155.

Project Gutenberg text #50079.

Project Gutenberg text #50191.

Project Gutenberg text #52551.

Project Gutenberg text #53458.

Project Gutenberg text #53525.

Project Gutenberg text #54138.

Project Gutenberg text #55314.

Project Gutenberg text #55555.

Project Gutenberg text #55705.

Project Gutenberg text #56585.

Project Gutenberg text #57340.