Advertisement
Guest User

Untitled

a guest
Jun 22nd, 2018
83
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 30.93 KB | None | 0 0
  1. Input Output
  2. 90 24
  3. 3000 430
  4. 9000 1117
  5. 4000000 283146 <--- input = 4*10^6
  6. 800000000 41146179 <--- input = 9*10^8
  7. 1100000000 55662470 <--- input = 1.1*10^9
  8.  
  9. Input Output
  10. 1907000000 93875448 <--- input = 1.907*10^9
  11. 1337000000 66990613 <--- input = 1.337*10^9
  12. 1240000000 62366021 <--- input = 1.24*10^9
  13. 660000000 34286170 <--- input = 6.6*10^8
  14. 99820000 5751639 <--- input = 9.982*10^7
  15. 40550000 2465109 <--- input = 4.055*10^7
  16. 24850000 1557132 <--- input = 2.485*10^7
  17. 41500 4339
  18.  
  19. #!/bin/bash
  20.  
  21. a=(1907000000 1337000000 1240000000 660000000 99820000 40550000 24850000 41500)
  22.  
  23. echo DennisC
  24. exec 2>> times/dennisc.txt
  25. time for j in ${a[@]}; do ./dennisc $j; done >> /dev/null;
  26.  
  27. echo DennisPy
  28. exec 2>> times/dennispy.txt
  29. time for j in ${a[@]}; do pypy dennispy.py <<< $j; done >> /dev/null;
  30.  
  31. echo arjandelumens
  32. exec 2>> times/arjandelumens.txt
  33. time for j in ${a[@]}; do ./arjandelumens $j; done >> /dev/null;
  34.  
  35. echo orlp
  36. exec 2>> times/orlp.txt
  37. time for j in ${a[@]}; do ./orlp $j; done >> /dev/null;
  38.  
  39. # echo mwr247
  40. # time node-v4.3.1-linux-x64/bin/node mwr247.js
  41.  
  42. # mwr247 using js seems a bit longer, so I am going to run the fastest
  43. # and then come back to his.
  44.  
  45. # mwr247 provided a function, so I appended
  46. # console.log( F( <argument> ) )
  47. # to his code, for each argument.
  48.  
  49. $ ./timepi '-march=native -O3' pi 1000
  50. pi(x) = 93875448 real time: 2774958 ns
  51. pi(x) = 66990613 real time: 2158491 ns
  52. pi(x) = 62366021 real time: 2023441 ns
  53. pi(x) = 34286170 real time: 1233158 ns
  54. pi(x) = 5751639 real time: 384284 ns
  55. pi(x) = 2465109 real time: 239783 ns
  56. pi(x) = 1557132 real time: 196248 ns
  57. pi(x) = 4339 real time: 60597 ns
  58.  
  59. 0.00572879 s
  60.  
  61. real 0m28.006s
  62. user 0m15.703s
  63. sys 0m14.319s
  64.  
  65. real 0m8.934s
  66. user 0m8.795s
  67. sys 0m0.150s
  68.  
  69. real 0m8.956s
  70. user 0m8.818s
  71. sys 0m0.150s
  72.  
  73. real 0m8.907s
  74. user 0m8.775s
  75. sys 0m0.145s
  76.  
  77. real 0m8.904s
  78. user 0m8.775s
  79. sys 0m0.141s
  80.  
  81. real 0m8.902s
  82. user 0m8.783s
  83. sys 0m0.132s
  84.  
  85. real 0m9.087s
  86. user 0m8.923s
  87. sys 0m0.176s
  88.  
  89. real 0m8.905s
  90. user 0m8.778s
  91. sys 0m0.140s
  92.  
  93. real 0m9.005s
  94. user 0m8.859s
  95. sys 0m0.158s
  96.  
  97. real 0m8.911s
  98. user 0m8.789s
  99. sys 0m0.135s
  100.  
  101. real 0m8.907s
  102. user 0m8.781s
  103. sys 0m0.138s
  104.  
  105. $ time for i in 1.907e9 1.337e9 1.24e9 6.6e8 9.982e7 4.055e7 2.485e7 41500
  106. > do pypy pi.py <<< $i; done
  107. 93875448
  108. 66990613
  109. 62366021
  110. 34286170
  111. 5751639
  112. 2465109
  113. 1557132
  114. 4339
  115.  
  116. real 0m1.696s
  117. user 0m1.360s
  118. sys 0m0.332s
  119.  
  120. real 3m56.961s
  121. user 3m38.802s
  122. sys 0m18.512s
  123.  
  124. 0000000: cafe babe 0000 0034 0030 0a00 0900 1709 .......4.0......
  125. 0000010: 0018 0019 0a00 1a00 1b0a 0008 001c 0a00 ................
  126. 0000020: 1d00 1e0a 0008 001f 0a00 2000 2107 0022 .......... .!.."
  127. 0000030: 0700 2301 0006 3c69 6e69 743e 0100 0328 ..#...<init>...(
  128. 0000040: 2956 0100 0443 6f64 6501 000f 4c69 6e65 )V...Code...Line
  129. 0000050: 4e75 6d62 6572 5461 626c 6501 0004 6d61 NumberTable...ma
  130. 0000060: 696e 0100 1628 5b4c 6a61 7661 2f6c 616e in...([Ljava/lan
  131. 0000070: 672f 5374 7269 6e67 3b29 5601 0008 6e75 g/String;)V...nu
  132. 0000080: 6d50 7269 6d65 0100 0428 4929 4901 000d mPrime...(I)I...
  133. 0000090: 5374 6163 6b4d 6170 5461 626c 6501 0007 StackMapTable...
  134. 00000a0: 6973 5072 696d 6501 0004 2849 295a 0100 isPrime...(I)Z..
  135. 00000b0: 0a53 6f75 7263 6546 696c 6501 0006 452e .SourceFile...E.
  136. 00000c0: 6a61 7661 0c00 0a00 0b07 0024 0c00 2500 java.......$..%.
  137. 00000d0: 2607 0027 0c00 2800 290c 0010 0011 0700 &..'..(.).......
  138. 00000e0: 2a0c 002b 002c 0c00 1300 1407 002d 0c00 *..+.,.......-..
  139. 00000f0: 2e00 2f01 0001 4501 0010 6a61 7661 2f6c ../...E...java/l
  140. 0000100: 616e 672f 4f62 6a65 6374 0100 106a 6176 ang/Object...jav
  141. 0000110: 612f 6c61 6e67 2f53 7973 7465 6d01 0003 a/lang/System...
  142. 0000120: 6f75 7401 0015 4c6a 6176 612f 696f 2f50 out...Ljava/io/P
  143. 0000130: 7269 6e74 5374 7265 616d 3b01 0011 6a61 rintStream;...ja
  144. 0000140: 7661 2f6c 616e 672f 496e 7465 6765 7201 va/lang/Integer.
  145. 0000150: 0008 7061 7273 6549 6e74 0100 1528 4c6a ..parseInt...(Lj
  146. 0000160: 6176 612f 6c61 6e67 2f53 7472 696e 673b ava/lang/String;
  147. 0000170: 2949 0100 136a 6176 612f 696f 2f50 7269 )I...java/io/Pri
  148. 0000180: 6e74 5374 7265 616d 0100 0770 7269 6e74 ntStream...print
  149. 0000190: 6c6e 0100 0428 4929 5601 000e 6a61 7661 ln...(I)V...java
  150. 00001a0: 2f6c 616e 672f 4d61 7468 0100 0473 7172 /lang/Math...sqr
  151. 00001b0: 7401 0004 2844 2944 0021 0008 0009 0000 t...(D)D.!......
  152. 00001c0: 0000 0004 0001 000a 000b 0001 000c 0000 ................
  153. 00001d0: 001d 0001 0001 0000 0005 2ab7 0001 b100 ..........*.....
  154. 00001e0: 0000 0100 0d00 0000 0600 0100 0000 0100 ................
  155. 00001f0: 0900 0e00 0f00 0100 0c00 0000 2c00 0300 ............,...
  156. 0000200: 0100 0000 10b2 0002 2a03 32b8 0003 b800 ........*.2.....
  157. 0000210: 04b6 0005 b100 0000 0100 0d00 0000 0a00 ................
  158. 0000220: 0200 0000 0300 0f00 0400 0a00 1000 1100 ................
  159. 0000230: 0100 0c00 0000 6600 0200 0300 0000 2003 ......f....... .
  160. 0000240: 3c03 3d1c 1aa2 0018 1b1c b800 0699 0007 <.=.............
  161. 0000250: 04a7 0004 0360 3c84 0201 a7ff e91b ac00 .....`<.........
  162. 0000260: 0000 0200 0d00 0000 1600 0500 0000 0600 ................
  163. 0000270: 0200 0700 0900 0800 1800 0700 1e00 0900 ................
  164. 0000280: 1200 0000 1800 04fd 0004 0101 5001 ff00 ............P...
  165. 0000290: 0000 0301 0101 0002 0101 fa00 0700 0a00 ................
  166. 00002a0: 1300 1400 0100 0c00 0000 9700 0300 0300 ................
  167. 00002b0: 0000 4c1a 05a2 0005 03ac 1a05 9f00 081a ..L.............
  168. 00002c0: 06a0 0005 04ac 1a05 7099 0009 1a06 709a ........p.....p.
  169. 00002d0: 0005 03ac 1a87 b800 078e 0460 3c10 063d ...........`<..=
  170. 00002e0: 1c1b a300 1b1a 1c04 6470 9900 0b1a 1c04 ........dp......
  171. 00002f0: 6070 9a00 0503 ac84 0206 a7ff e604 ac00 `p..............
  172. 0000300: 0000 0200 0d00 0000 2200 0800 0000 0c00 ........".......
  173. 0000310: 0700 0d00 1300 0e00 2100 0f00 2a00 1000 ........!...*...
  174. 0000320: 3200 1100 4400 1000 4a00 1200 1200 0000 2...D...J.......
  175. 0000330: 1100 0907 0901 0b01 fd00 0b01 0114 01fa ................
  176. 0000340: 0005 0001 0015 0000 0002 0016 ............
  177.  
  178. > time java E 41500;time java E 24850000;time java E 40550000;time java E 99820000;time java E 660000000;time java E 1240000000;time java E 1337000000;time java E 1907000000
  179. 4339
  180.  
  181. real 0m0.236s
  182. user 0m0.112s
  183. sys 0m0.024s
  184. 1557132
  185.  
  186. real 0m8.842s
  187. user 0m8.784s
  188. sys 0m0.060s
  189. 2465109
  190.  
  191. real 0m18.442s
  192. user 0m18.348s
  193. sys 0m0.116s
  194. 5751639
  195.  
  196. real 1m15.642s
  197. user 1m8.772s
  198. sys 0m0.252s
  199. 34286170
  200.  
  201. real 40m35.810s
  202. user 16m5.240s
  203. sys 0m5.820s
  204. 62366021
  205.  
  206. real 104m12.628s
  207. user 39m32.348s
  208. sys 0m13.584s
  209. 66990613
  210.  
  211. real 110m22.064s
  212. user 42m28.092s
  213. sys 0m11.320s
  214. 93875448
  215.  
  216. real 171m51.650s
  217. user 68m39.968s
  218. sys 0m14.916s
  219.  
  220. use std::env;
  221.  
  222. const EMPTY: usize = std::usize::MAX;
  223. const MAX_X: usize = 800;
  224.  
  225. fn main() {
  226. let args: Vec<_> = env::args().collect();
  227. let x: usize = args[1].trim().parse().expect("expected a number");
  228.  
  229. let root = (x as f64).sqrt() as usize;
  230. let y = (x as f64).powf(0.3333333333333) as usize + 1;
  231.  
  232. let sieve_size = x / y + 2;
  233. let mut sieve = vec![true; sieve_size];
  234. let mut primes = vec![0; sieve_size];
  235. sieve[0] = false;
  236. sieve[1] = false;
  237.  
  238. let mut a = 0;
  239. let mut num_primes = 1;
  240.  
  241. let mut num_primes_smaller_root = 0;
  242.  
  243. // find all primes up to x/y ~ x^2/3 aka sieve_size
  244. for i in 2..sieve_size {
  245. if sieve[i] {
  246. if i <= root {
  247. if i <= y {
  248. a += 1;
  249. }
  250. num_primes_smaller_root += 1;
  251. }
  252.  
  253. primes[num_primes] = i;
  254. num_primes += 1;
  255. let mut multiples = i;
  256. while multiples < sieve_size {
  257. sieve[multiples] = false;
  258. multiples += i;
  259. }
  260. }
  261. }
  262.  
  263. let interesting_primes = primes[a + 1..num_primes_smaller_root + 1].iter();
  264.  
  265. let p_2 =
  266. interesting_primes
  267. .map(|ip| primes.iter().take_while(|&&p| p <= x / ip).count())
  268. .enumerate()
  269. .map(|(i, v)| v - 1 - i - a)
  270. .fold(0, |acc, v| acc + v);
  271.  
  272. let mut phi_results = vec![EMPTY; (a + 1) * MAX_X];
  273. println!("pi({}) = {}", x, phi(x, a, &primes, &mut phi_results) + a - 1 - p_2);
  274. }
  275.  
  276. fn phi(x: usize, b: usize, primes: &[usize], phi_results: &mut [usize]) -> usize {
  277. if b == 0 {
  278. return x;
  279. }
  280.  
  281. if x < MAX_X && phi_results[x + b * MAX_X] != EMPTY {
  282. return phi_results[x + b * MAX_X];
  283. }
  284.  
  285. let value = phi(x, b - 1, primes, phi_results) - phi(x / primes[b], b - 1, primes, phi_results);
  286. if x < MAX_X {
  287. phi_results[x + b * MAX_X] = value;
  288. }
  289. value
  290. }
  291.  
  292. #!/bin/bash
  293.  
  294. a=(1907000000 1337000000 1240000000 660000000 99820000 40550000 24850000 41500)
  295.  
  296. for i in {0..100}; do
  297. for j in ${a[@]}; do
  298. ./target/release/pi_n $j > /dev/null;
  299. done;
  300. done;
  301.  
  302. [me@localhost pi_n]$ time ./time.sh
  303.  
  304. real 0m37.011s
  305. user 0m34.752s
  306. sys 0m2.410s
  307.  
  308. #include <cmath>
  309. #include <cstdint>
  310. #include <iostream>
  311. #include <vector>
  312. #include <boost/dynamic_bitset.hpp>
  313.  
  314. uint64_t num_primes(uint64_t n) {
  315. // http://stackoverflow.com/questions/4643647/fast-prime-factorization-module
  316. uint64_t pi = (n >= 2) + (n >= 3);
  317. if (n < 5) return pi;
  318.  
  319. n += 1;
  320. uint64_t correction = n % 6 > 1;
  321. uint64_t wheels[6] = { n, n - 1, n + 4, n + 3, n + 2, n + 1 };
  322. uint64_t limit = wheels[n % 6];
  323.  
  324. boost::dynamic_bitset<> sieve(limit / 3);
  325. sieve.set();
  326. sieve[0] = false;
  327.  
  328. for (uint64_t i = 0, upper = uint64_t(std::sqrt(limit))/3; i <= upper; ++i) {
  329. if (sieve[i]) {
  330. uint64_t k = (3*i + 1) | 1;
  331. for (uint64_t j = (k*k) / 3; j < limit/3; j += 2*k) sieve[j] = false;
  332. for (uint64_t j = (k*k + 4*k - 2*k*(i & 1))/3; j < limit/3; j += 2*k) sieve[j] = false;
  333. }
  334. }
  335.  
  336. pi += sieve.count();
  337. for (uint64_t i = limit / 3 - correction; i < limit / 3; ++i) pi -= sieve[i];
  338.  
  339. return pi;
  340. }
  341.  
  342.  
  343. int main(int argc, char** argv) {
  344. if (argc <= 1) {
  345. std::cout << "Usage: " << argv[0] << " nn";
  346. return 0;
  347. }
  348.  
  349. std::cout << num_primes(std::stoi(argv[1])) << "n";
  350. return 0;
  351. }
  352.  
  353. real 0m22.760s
  354. user 0m22.704s
  355. sys 0m0.080s
  356.  
  357. real 0m22.854s
  358. user 0m22.800s
  359. sys 0m0.077s
  360.  
  361. real 0m22.742s
  362. user 0m22.700s
  363. sys 0m0.066s
  364.  
  365. real 0m22.484s
  366. user 0m22.450s
  367. sys 0m0.059s
  368.  
  369. real 0m22.653s
  370. user 0m22.597s
  371. sys 0m0.080s
  372.  
  373. real 0m22.665s
  374. user 0m22.602s
  375. sys 0m0.088s
  376.  
  377. real 0m22.528s
  378. user 0m22.489s
  379. sys 0m0.062s
  380.  
  381. real 0m22.510s
  382. user 0m22.474s
  383. sys 0m0.060s
  384.  
  385. real 0m22.819s
  386. user 0m22.759s
  387. sys 0m0.084s
  388.  
  389. real 0m22.488s
  390. user 0m22.459s
  391. sys 0m0.053s
  392.  
  393. 10^4 (10,000) => 0.001 seconds
  394. 10^5 (100,000) => 0.003 seconds
  395. 10^6 (1,000,000) => 0.009 seconds
  396. 10^7 (10,000,000) => 0.074 seconds
  397. 10^8 (100,000,000) => 1.193 seconds
  398. 10^9 (1,000,000,000) => 14.415 seconds
  399.  
  400. #include <cstdint>
  401. #include <vector>
  402. #include <iostream>
  403. #include <limits>
  404. #include <cmath>
  405. #include <array>
  406. // uses posix ffsll
  407. #include <string.h>
  408. #include <algorithm>
  409. #include <thread>
  410.  
  411. constexpr uint64_t wheel_width = 2;
  412. constexpr uint64_t buf_size = 1<<(10+6);
  413. constexpr uint64_t dtype_width = 6;
  414. constexpr uint64_t dtype_mask = 63;
  415. constexpr uint64_t buf_len = ((buf_size*wheel_width)>>dtype_width);
  416. constexpr uint64_t seg_len = 6*buf_size;
  417. constexpr uint64_t limit_i_max = 0xfffffffe00000001ULL;
  418.  
  419. typedef std::vector<uint64_t> buf_type;
  420.  
  421. void mark_composite(buf_type& buf, uint64_t prime,
  422. std::array<uint64_t, 2>& poff,
  423. uint64_t seg_start, uint64_t max_j)
  424. {
  425. const auto p = 2*prime;
  426. for(uint64_t k = 0; k < wheel_width; ++k)
  427. {
  428. for(uint64_t j = 2*poff[k]+(k==0); j < max_j; j += p)
  429. {
  430. buf[(j-seg_start)>>dtype_width] |= 1ULL << (j & dtype_mask);
  431. poff[k] += prime;
  432. }
  433. }
  434. }
  435.  
  436. struct prime_counter
  437. {
  438. buf_type buf;
  439. uint64_t n;
  440. uint64_t seg_a, seg_b;
  441. uint64_t nj;
  442. uint64_t store_max;
  443. uint64_t& store_res;
  444.  
  445. prime_counter(uint64_t n, uint64_t seg_a, uint64_t seg_b, uint64_t nj, uint64_t store_max,
  446. uint64_t& store_res) :
  447. buf(buf_len), n(n), nj(nj), seg_a(seg_a), seg_b(seg_b),
  448. store_max(store_max), store_res(store_res)
  449. {}
  450.  
  451. prime_counter(const prime_counter&) = default;
  452. prime_counter(prime_counter&&) = default;
  453.  
  454. prime_counter& operator =(const prime_counter&) = default;
  455. prime_counter& operator =(prime_counter&&) = default;
  456.  
  457. void operator()(uint64_t nsmall_segs,
  458. const std::vector<uint64_t>& primes,
  459. std::vector<std::array<uint64_t, 2> > poffs)
  460. {
  461. uint64_t res = 0;
  462. // no new prime added portion
  463. uint64_t seg_start = buf_size*wheel_width*seg_a;
  464. uint64_t seg_min = seg_len*seg_a+5;
  465.  
  466. if(seg_a > nsmall_segs)
  467. {
  468. uint64_t max_j = buf_size*wheel_width*nsmall_segs+(seg_a-nsmall_segs)*(buf_len<<dtype_width);
  469. for(size_t k = 0; k < wheel_width; ++k)
  470. {
  471. for(uint64_t i = 0; i < poffs.size() && max_j >= (2*poffs[i][k]+(k==0)); ++i)
  472. {
  473. // adjust poffs
  474. // TODO: might be a more efficient way
  475. auto w = (max_j-(2*poffs[i][k]+(k==0)));
  476. poffs[i][k] += primes[i]*(w/(2*primes[i]));
  477. if(w % (2*primes[i]) != 0)
  478. {
  479. poffs[i][k]+=primes[i];// += primes[i]*(w/(2*primes[i])+1);
  480. }
  481. /*else
  482. {
  483.  
  484. }*/
  485. }
  486. }
  487. }
  488.  
  489. for(uint64_t seg = seg_a; seg < seg_b; ++seg)
  490. {
  491. std::fill(buf.begin(), buf.end(), 0);
  492. const uint64_t limit_i = std::min<uint64_t>((((seg_len+seg_min) >= limit_i_max) ?
  493. std::numeric_limits<uint32_t>::max() :
  494. ceil(sqrt(seg_len+seg_min))),
  495. store_max);
  496. uint64_t max_j = std::min(seg_start+(buf_len<<dtype_width), nj);
  497. for(uint64_t i = 0; i < primes.size() && primes[i] <= limit_i; ++i)
  498. {
  499. mark_composite(buf, primes[i], poffs[i], seg_start, max_j);
  500. }
  501. // sieve
  502. uint64_t val;
  503. const uint64_t stop = std::min(seg_min+seg_len, n);
  504. for(uint64_t i = ffsll(~(buf[0]))-((~buf[0]) != 0)+64*((~buf[0]) == 0);
  505. (val = 6ULL*(i>>1)+seg_min+2ULL*(i&1ULL)) < stop;)
  506. {
  507. if(!(buf[i>>dtype_width] & (1ULL << (i & dtype_mask))))
  508. {
  509. ++res;
  510. ++i;
  511. }
  512. else
  513. {
  514. uint64_t mask = buf[i>>dtype_width]>>(i&dtype_mask);
  515. const int64_t inc = ffsll(~mask)-((~mask) != 0)+64*((~mask) == 0);
  516. i += inc;
  517. }
  518. }
  519. seg_min += seg_len;
  520. seg_start += buf_size*wheel_width;
  521. }
  522. store_res = res;
  523. }
  524. };
  525.  
  526. uint64_t num_primes(uint64_t n)
  527. {
  528. uint64_t res = (n >= 2) + (n >= 3);
  529. if(n >= 5)
  530. {
  531. buf_type buf(buf_len);
  532. // compute and store primes < sqrt(n)
  533. const uint64_t store_max = ceil(sqrt(n));
  534.  
  535. // only primes >= 5
  536. std::vector<uint64_t> primes;
  537. std::vector<std::array<uint64_t, 2> > poffs;
  538. primes.reserve(ceil(1.25506*store_max/log(store_max)));
  539. poffs.reserve(ceil(1.25506*store_max/log(store_max)));
  540. uint64_t seg_start = 0;
  541. uint64_t seg_min = 5;
  542. const uint64_t num_segs = 1+(n-seg_min)/seg_len;
  543. const uint64_t nj = (n-seg_min)/3+1;
  544. // compute how many small segments there are
  545. const uint64_t nsmall_segs = 1+(store_max-seg_min)/seg_len;
  546. for(uint64_t seg = 0; seg < nsmall_segs; ++seg)
  547. {
  548. std::fill(buf.begin(), buf.end(), 0);
  549. // mark off small primes
  550. const uint64_t limit_i = std::min<uint64_t>((((seg_len+seg_min) >= limit_i_max) ?
  551. std::numeric_limits<uint32_t>::max() :
  552. ceil(sqrt(seg_len+seg_min))),
  553. store_max);
  554. uint64_t max_j = std::min(seg_start+(buf_len<<dtype_width), nj);
  555. for(uint64_t i = 0; i < primes.size() && primes[i] <= limit_i; ++i)
  556. {
  557. mark_composite(buf, primes[i], poffs[i], seg_start, max_j);
  558. }
  559. // sieve
  560. uint64_t val;
  561. const uint64_t stop = std::min(seg_min+seg_len, n);
  562. for(uint64_t i = ffsll(~(buf[0]))-((~buf[0]) != 0)+64*((~buf[0]) == 0);
  563. (val = 6ULL*(i>>1)+seg_min+2ULL*(i&1ULL)) < stop;)
  564. {
  565. if(!(buf[i>>dtype_width] & (1ULL << (i & dtype_mask))))
  566. {
  567. if(val <= store_max)
  568. {
  569. // add prime and poffs
  570. primes.push_back(val);
  571. poffs.emplace_back();
  572. poffs.back()[0] = (val*val-1)/6-1;
  573. if(i&1)
  574. {
  575. // 6n+1 prime
  576. poffs.back()[1] = (val*val+4*val-5)/6;
  577. }
  578. else
  579. {
  580. // 6n+5 prime
  581. poffs.back()[1] = (val*val+2*val-5)/6;
  582. }
  583. // mark-off multiples
  584. mark_composite(buf, val, poffs.back(), seg_start, max_j);
  585. }
  586. ++res;
  587. ++i;
  588. }
  589. else
  590. {
  591. uint64_t mask = buf[i>>dtype_width]>>(i&dtype_mask);
  592. const int64_t inc = ffsll(~mask)-((~mask) != 0)+64*((~mask) == 0);
  593. i += inc;
  594. }
  595. }
  596. seg_min += seg_len;
  597. seg_start += buf_size*wheel_width;
  598. }
  599. // multi-threaded sieving for remaining segments
  600. std::vector<std::thread> workers;
  601. auto num_workers = std::min<uint64_t>(num_segs-nsmall_segs, std::thread::hardware_concurrency());
  602. std::vector<uint64_t> store_reses(num_workers);
  603.  
  604. workers.reserve(num_workers);
  605. auto num_segs_pw = ceil((num_segs-nsmall_segs)/static_cast<double>(num_workers));
  606. for(size_t i = 0; i < num_workers; ++i)
  607. {
  608. workers.emplace_back(prime_counter(n, nsmall_segs+i*num_segs_pw,
  609. std::min<uint64_t>(nsmall_segs+(i+1)*num_segs_pw,
  610. num_segs),
  611. nj, store_max, store_reses[i]),
  612. nsmall_segs, primes, poffs);
  613. }
  614. for(size_t i = 0; i < num_workers; ++i)
  615. {
  616. workers[i].join();
  617. res += store_reses[i];
  618. }
  619. }
  620. return res;
  621. }
  622.  
  623. int main(int argc, char** argv)
  624. {
  625. if(argc <= 1)
  626. {
  627. std::cout << "usage: " << argv[0] << " nn";
  628. return -1;
  629. }
  630. std::cout << num_primes(std::stoll(argv[1])) << 'n';
  631. }
  632.  
  633. g++ -std=c++11 -o sieve_mt -O3 -march=native -pthread sieve_mt.cpp
  634.  
  635. real 4m7.215s
  636. user 23m54.086s
  637. sys 0m1.239s
  638.  
  639. #include <stdlib.h>
  640. #include <stdio.h>
  641. #include <string.h>
  642.  
  643. unsigned char p[2000000001];
  644.  
  645. int main(int argc, char **argv)
  646. {
  647. unsigned int n, c, i, j;
  648.  
  649. n = atoi(argv[1]);
  650. memset(p, 1, n + 1);
  651.  
  652. p[1] = p[0] = 0;
  653.  
  654. for (i = 2, c = 0; i <= n; i++)
  655. {
  656. if (p[i])
  657. {
  658. c++;
  659. for (j = i + i; j <= n; j += i)
  660. p[j] = 0;
  661. }
  662. }
  663.  
  664. printf("%d: %dn", n, c);
  665.  
  666. return 0;
  667. }
  668.  
  669. real 2m42.657s
  670. user 2m42.065s
  671. sys 0m0.757s
  672.  
  673. real 2m42.947s
  674. user 2m42.400s
  675. sys 0m0.708s
  676.  
  677. real 2m42.827s
  678. user 2m42.282s
  679. sys 0m0.703s
  680.  
  681. real 2m42.800s
  682. user 2m42.300s
  683. sys 0m0.665s
  684.  
  685. real 2m42.562s
  686. user 2m42.050s
  687. sys 0m0.675s
  688.  
  689. real 2m42.788s
  690. user 2m42.192s
  691. sys 0m0.756s
  692.  
  693. real 2m42.631s
  694. user 2m42.074s
  695. sys 0m0.720s
  696.  
  697. real 2m42.658s
  698. user 2m42.115s
  699. sys 0m0.707s
  700.  
  701. real 2m42.710s
  702. user 2m42.219s
  703. sys 0m0.657s
  704.  
  705. real 2m42.674s
  706. user 2m42.110s
  707. sys 0m0.730s
  708.  
  709. real 1m21.227s
  710. user 1m20.755s
  711. sys 0m0.576s
  712.  
  713. real 1m20.944s
  714. user 1m20.426s
  715. sys 0m0.640s
  716.  
  717. real 1m21.052s
  718. user 1m20.581s
  719. sys 0m0.573s
  720.  
  721. real 1m21.328s
  722. user 1m20.862s
  723. sys 0m0.570s
  724.  
  725. real 1m21.253s
  726. user 1m20.780s
  727. sys 0m0.588s
  728.  
  729. real 1m20.925s
  730. user 1m20.460s
  731. sys 0m0.576s
  732.  
  733. real 1m21.011s
  734. user 1m20.512s
  735. sys 0m0.601s
  736.  
  737. real 1m21.011s
  738. user 1m20.550s
  739. sys 0m0.564s
  740.  
  741. real 1m20.875s
  742. user 1m20.409s
  743. sys 0m0.569s
  744.  
  745. real 1m21.703s
  746. user 1m21.088s
  747. sys 0m0.701s
  748.  
  749. public class PrimeCounter
  750. {
  751. public static final String START_CODE="=",
  752. TEST_FORMAT="Input = %d , Output = %d , calculated in %f seconds%n",
  753. PROMPT="Enter numbers to compute pi(x) for (Type ""+START_CODE+"" to start):%n",
  754. WAIT="Calculating, please wait...%n",
  755. WARNING="Probably won't work with values close to or more than 2^31%n",
  756. TOTAL_OUTPUT_FORMAT="Total time for all inputs is %f seconds%n";
  757. public static final int NUM_THREADS=16,LOW_LIM=1,HIGH_LIM=1<<28;
  758. private static final Object LOCK=new Lock();
  759. private static final class Lock{}
  760. /**
  761. * Generates and counts primes using an optimized but naive iterative algorithm.
  762. * Uses MultiThreading for arguments above LOW_LIM
  763. * @param MAX : argument x for pi(x), the limit to which to generate numbers.
  764. */
  765. public static long primeCount(long MAX){
  766. long ctr=1;
  767. if(MAX<1<<7){
  768. for(long i=3;i<=MAX;i+=2){
  769. if(isPrime(i))++ctr;
  770. }
  771. }else{
  772. long[] counts=new long[NUM_THREADS];
  773. for(int i=0;i<NUM_THREADS;++i){
  774. counts[i]=-1;
  775. }
  776. long range=Math.round((double)MAX/NUM_THREADS);
  777. for(int i=0;i<NUM_THREADS;++i){
  778. long start=(i==0)?3:i*range+1,end=(i==NUM_THREADS-1)?MAX:(i+1)*range;
  779. final int idx=i;
  780. new Thread(new Runnable(){
  781. public void run(){
  782. for(long j=start;j<=end;j+=2){
  783. if(isPrime(j))++counts[idx];
  784. }
  785. }
  786. }).start();
  787. }
  788. synchronized(LOCK){
  789. while(!completed(counts)){
  790. try{
  791. LOCK.wait(300);}catch(InterruptedException ie){}
  792. }
  793. LOCK.notifyAll();
  794. }
  795. for(long count:counts){
  796. ctr+=count;
  797. }
  798. ctr+=NUM_THREADS;
  799. }
  800. return ctr;
  801. }
  802.  
  803. /**
  804. * Checks for completion of threads
  805. * @param array : The array containing the completion data
  806. */
  807. private static boolean completed(long[] array){
  808. for(long i:array){
  809. if(i<0)return false;
  810. }return true;
  811. }
  812.  
  813. /**
  814. * Checks if the parameter is prime or not.
  815. * 2,3,5,7 are hardcoded as factors.
  816. * @param n : the number to check for primality
  817. */
  818. private static boolean isPrime(long n){
  819. if(n==2||n==3||n==5||n==7)return true;
  820. else if(n%2==0||n%3==0||n%5==0||n%7==0)return false;
  821. else{
  822. for(long i=11;i<n;i+=2){
  823. if(n%i==0)return false;
  824. }
  825. return true;
  826. }
  827. }
  828.  
  829. /**
  830. * Calculates primes using the atandard Sieve of Eratosthenes.
  831. * Uses 2,3,5,7 wheel factorization for elimination (hardcoded for performance reasons)
  832. * @param MAX : argument x for pi(x)
  833. * Will delegate to <code>primeCount(long)</code> for MAX<LOW_LIM and to <code>bitPrimeSieve(long)</code>
  834. * for MAX>HIGH_LIM, for performance reasons.
  835. */
  836. public static long primeSieve(long MAX){
  837. if(MAX<=1)return 0;
  838. else if(LOW_LIM>0&&MAX<LOW_LIM){return primeCount(MAX);}
  839. else if(HIGH_LIM>0&&MAX>HIGH_LIM){return bitPrimeSieve(MAX);}
  840. int n=(int)MAX;
  841. int sn=(int)Math.sqrt(n),ctr=2;
  842. if(sn%2==0)--sn;
  843. boolean[]ps=new boolean[n+1];
  844. for(int i=2;i<=n;++i){
  845. if(i==2||i==3||i==5||i==7)ps[i]=true;
  846. else if(i%2!=0&&i%3!=0&&i%5!=0&&i%7!=0)ps[i]=true;
  847. else ++ctr;
  848. }
  849. for(int i=(n>10)?11:3;i<=sn;i+=2){
  850. if(ps[i]){
  851. for(int j=i*i;j<=n;j+=i){
  852. if(ps[j]){ ps[j]=false;++ctr;}
  853. }
  854. }
  855. }
  856. return (n+1-ctr);
  857. }
  858. /**
  859. * Calculates primes using bitmasked Sieve of Eratosthenes.
  860. * @param MAX : argument x for pi(x)
  861. */
  862. public static long bitPrimeSieve(long MAX) {
  863. long SQRT_MAX = (long) Math.sqrt(MAX);
  864. if(SQRT_MAX%2==0)--SQRT_MAX;
  865. int MEMORY_SIZE = (int) ((MAX+1) >> 4);
  866. byte[] array = new byte[MEMORY_SIZE];
  867. for (long i = 3; i <= SQRT_MAX; i += 2) {
  868. if ((array[(int) (i >> 4)] & (byte) (1 << ((i >> 1) & 7))) == 0) {
  869. for(long j=i*i;j<=MAX;j+=i<<1) {
  870. if((array[(int) (j >> 4)] & (byte) (1 << ((j >> 1) & 7))) == 0){
  871. array[(int) (j >> 4)] |= (byte) (1 << ((j >> 1) & 7));
  872. }
  873. }
  874. }
  875. }
  876. long pi = 1;
  877. for (long i = 3; i <= MAX; i += 2) {
  878. if ((array[(int) (i >> 4)] & (byte) (1 << ((i >> 1) & 7))) == 0) {
  879. ++pi;
  880. }
  881. }
  882. return pi;
  883. }
  884. /**
  885. * Private testing and timer function
  886. * @param MAX : input to be passed on to <code>primeSieve(long)</code>
  887. */
  888. private static long sieveTest(long MAX){
  889. long start=System.nanoTime();
  890. long ps=primeSieve(MAX);
  891. long end=System.nanoTime();
  892. System.out.format(TEST_FORMAT,MAX,ps,((end-start)/1E9));
  893. return end-start;
  894. }
  895. /**
  896. * Main method: accepts user input and shows total execution time taken
  897. * @param args : The command-line arguments
  898. */
  899. public static void main(String[]args){
  900. double total_time=0;
  901. java.util.Scanner sc=new java.util.Scanner(System.in);
  902. java.util.ArrayList<Long> numbers=new java.util.ArrayList<>();
  903. System.out.format(PROMPT+WARNING);
  904. String line=sc.nextLine();
  905. while(!line.equals(START_CODE)/*sc.hasNextLine()&&Character.isDigit(line.charAt(0))*/){
  906. numbers.add(Long.valueOf(line));
  907. line=sc.nextLine();
  908. }
  909. System.out.format(WAIT);
  910. for(long num:numbers){
  911. total_time+=sieveTest(num);
  912. }
  913. System.out.format(TOTAL_OUTPUT_FORMAT,total_time/1e9);
  914. }
  915. }
  916.  
  917. javac PrimeCounter.java
  918. java PrimeCounter
  919.  
  920. Enter numbers to compute pi(x) for (Type "=" to start):
  921. Probably won't work with values close to or more than 2^31
  922. 41500
  923. 24850000
  924. 40550000
  925. 99820000
  926. 660000000
  927. 1240000000
  928. 1337000000
  929. 1907000000
  930. =
  931. Calculating, please wait...
  932. Input = 41500 , Output = 4339 , calculated in 0.002712 seconds
  933. Input = 24850000 , Output = 1557132 , calculated in 0.304792 seconds
  934. Input = 40550000 , Output = 2465109 , calculated in 0.523999 seconds
  935. Input = 99820000 , Output = 5751639 , calculated in 1.326542 seconds
  936. Input = 660000000 , Output = 34286170 , calculated in 4.750049 seconds
  937. Input = 1240000000 , Output = 62366021 , calculated in 9.160406 seconds
  938. Input = 1337000000 , Output = 66990613 , calculated in 9.989093 seconds
  939. Input = 1907000000 , Output = 93875448 , calculated in 14.832107 seconds
  940. Total time for all inputs is 40.889700 seconds
  941.  
  942. import sys
  943.  
  944. sys.setrecursionlimit(sys.maxsize)
  945.  
  946. n = int(sys.argv[-1])
  947.  
  948. if n < 4:
  949. print(0 if n < 2 else n-1)
  950. exit()
  951.  
  952. p = [0, 0] + [True] * n
  953.  
  954. i = 0
  955. while i < pow(n, 0.5):
  956. if p[i]:
  957. j = pow(i, 2)
  958. while j < n:
  959. p[j] = False
  960. j += i
  961. i += 1
  962.  
  963. print(sum(p) - 2)
  964.  
  965. $ time python3 test.py 90
  966. 24
  967.  
  968. real 0m0.045s
  969. user 0m0.031s
  970. sys 0m0.010s
  971.  
  972. #include <cstdint>
  973. #include <vector>
  974. #include <iostream>
  975. #include <limits>
  976. #include <cmath>
  977. #include <array>
  978. // uses posix ffsll
  979. #include <string.h>
  980. #include <algorithm>
  981.  
  982. constexpr uint64_t wheel_width = 2;
  983. constexpr uint64_t buf_size = 1<<(10+6);
  984. constexpr uint64_t dtype_width = 6;
  985. constexpr uint64_t dtype_mask = 63;
  986. constexpr uint64_t buf_len = ((buf_size*wheel_width)>>dtype_width);
  987.  
  988. typedef std::vector<uint64_t> buf_type;
  989.  
  990. void mark_composite(buf_type& buf, uint64_t prime,
  991. std::array<uint64_t, 2>& poff,
  992. uint64_t seg_start, uint64_t max_j)
  993. {
  994. const auto p = 2*prime;
  995. for(uint64_t k = 0; k < wheel_width; ++k)
  996. {
  997. for(uint64_t j = 2*poff[k]+(k==0); j < max_j; j += p)
  998. {
  999. buf[(j-seg_start)>>dtype_width] |= 1ULL << (j & dtype_mask);
  1000. poff[k] += prime;
  1001. }
  1002. }
  1003. }
  1004.  
  1005. uint64_t num_primes(uint64_t n)
  1006. {
  1007. uint64_t res = (n >= 2) + (n >= 3);
  1008. if(n >= 5)
  1009. {
  1010. buf_type buf(buf_len);
  1011. // compute and store primes < sqrt(n)
  1012. const uint64_t store_max = ceil(sqrt(n));
  1013.  
  1014. // only primes >= 5
  1015. std::vector<uint64_t> primes; // 5,7,11
  1016. std::vector<std::array<uint64_t, 2> > poffs;// {{3,0},{0,5},{8,1}};
  1017. primes.reserve(ceil(1.25506*store_max/log(store_max)));
  1018. poffs.reserve(ceil(1.25506*store_max/log(store_max)));
  1019. uint64_t seg_start = 0;
  1020. uint64_t seg_min = 5;
  1021. constexpr uint64_t seg_len = 6*buf_size;///wheel_width;
  1022. constexpr uint64_t limit_i_max = 0xfffffffe00000001ULL;
  1023. const uint64_t num_segs = 1+(n-seg_min)/seg_len;
  1024. const uint64_t nj = (n-seg_min)/3+1;
  1025. for(uint64_t seg = 0; seg < num_segs; ++seg)
  1026. {
  1027. std::fill(buf.begin(), buf.end(), 0);
  1028. // mark off small primes
  1029. const uint64_t limit_i = std::min<uint64_t>((((seg_len+seg_min) >= limit_i_max) ?
  1030. std::numeric_limits<uint32_t>::max() :
  1031. ceil(sqrt(seg_len+seg_min))),
  1032. store_max);
  1033. uint64_t max_j = std::min(seg_start+(buf_len<<dtype_width), nj);
  1034. for(uint64_t i = 0; i < primes.size() && primes[i] <= limit_i; ++i)
  1035. {
  1036. mark_composite(buf, primes[i], poffs[i], seg_start, max_j);
  1037. }
  1038. // sieve
  1039. uint64_t val;
  1040. const uint64_t stop = std::min(seg_min+seg_len, n);
  1041. for(uint64_t i = ffsll(~(buf[0]))-((~buf[0]) != 0)+64*((~buf[0]) == 0);
  1042. (val = 6ULL*(i>>1)+seg_min+2ULL*(i&1ULL)) < stop;)
  1043. {
  1044. if(!(buf[i>>dtype_width] & (1ULL << (i & dtype_mask))))
  1045. {
  1046. if(val <= store_max)
  1047. {
  1048. // add prime and poffs
  1049. primes.push_back(val);
  1050. poffs.emplace_back();
  1051. poffs.back()[0] = (val*val-1)/6-1;
  1052. if(i&1)
  1053. {
  1054. // 6n+1 prime
  1055. poffs.back()[1] = (val*val+4*val-5)/6;
  1056. }
  1057. else
  1058. {
  1059. // 6n+5 prime
  1060. poffs.back()[1] = (val*val+2*val-5)/6;
  1061. }
  1062. // mark-off multiples
  1063. mark_composite(buf, val, poffs.back(), seg_start, max_j);
  1064. }
  1065. ++res;
  1066. ++i;
  1067. }
  1068. else
  1069. {
  1070. uint64_t mask = buf[i>>dtype_width]>>(i&dtype_mask);
  1071. const int64_t inc = ffsll(~mask)-((~mask) != 0)+64*((~mask) == 0);
  1072. i += inc;
  1073. }
  1074. }
  1075. seg_min += seg_len;
  1076. seg_start += buf_size*wheel_width;
  1077. }
  1078. }
  1079. return res;
  1080. }
  1081.  
  1082. int main(int argc, char** argv)
  1083. {
  1084. if(argc <= 1)
  1085. {
  1086. std::cout << "usage: " << argv[0] << " nn";
  1087. return -1;
  1088. }
  1089. std::cout << num_primes(std::stoll(argv[1])) << 'n';
  1090. }
  1091.  
  1092. g++ -std=c++11 -o sieve -O3 -march=native sieve.cpp
  1093.  
  1094. 41500
  1095. 4339
  1096.  
  1097. real 0m0.001s
  1098. user 0m0.000s
  1099. sys 0m0.000s
  1100.  
  1101. 24850000
  1102. 1557132
  1103.  
  1104. real 0m0.036s
  1105. user 0m0.032s
  1106. sys 0m0.000s
  1107.  
  1108. 40550000
  1109. 2465109
  1110.  
  1111. real 0m0.056s
  1112. user 0m0.052s
  1113. sys 0m0.000s
  1114.  
  1115. 99820000
  1116. 5751639
  1117.  
  1118. real 0m0.149s
  1119. user 0m0.144s
  1120. sys 0m0.000s
  1121.  
  1122. 660000000
  1123. 34286170
  1124.  
  1125. real 0m0.795s
  1126. user 0m0.788s
  1127. sys 0m0.000s
  1128.  
  1129. 1240000000
  1130. 62366021
  1131.  
  1132. real 0m1.468s
  1133. user 0m1.464s
  1134. sys 0m0.000s
  1135.  
  1136. 1337000000
  1137. 66990613
  1138.  
  1139. real 0m1.583s
  1140. user 0m1.576s
  1141. sys 0m0.004s
  1142.  
  1143. 1907000000
  1144. 93875448
  1145.  
  1146. real 0m2.363s
  1147. user 0m2.356s
  1148. sys 0m0.000s
  1149.  
  1150. real 0m9.415s
  1151. user 0m9.414s
  1152. sys 0m0.014s
  1153.  
  1154. real 0m9.315s
  1155. user 0m9.315s
  1156. sys 0m0.013s
  1157.  
  1158. real 0m9.307s
  1159. user 0m9.309s
  1160. sys 0m0.012s
  1161.  
  1162. real 0m9.333s
  1163. user 0m9.330s
  1164. sys 0m0.017s
  1165.  
  1166. real 0m9.288s
  1167. user 0m9.289s
  1168. sys 0m0.012s
  1169.  
  1170. real 0m9.319s
  1171. user 0m9.318s
  1172. sys 0m0.015s
  1173.  
  1174. real 0m9.285s
  1175. user 0m9.284s
  1176. sys 0m0.015s
  1177.  
  1178. real 0m9.342s
  1179. user 0m9.342s
  1180. sys 0m0.014s
  1181.  
  1182. real 0m9.305s
  1183. user 0m9.305s
  1184. sys 0m0.014s
  1185.  
  1186. real 0m9.312s
  1187. user 0m9.313s
  1188. sys 0m0.012s
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement