I, ME AND MYSELF !!!: sieve


Tuesday, April 2, 2013

SPOJ FINDPRM

SPOJ Problem Set (Classical) 5970. Finding Primes

Fun problem! First it wants you to run a sieve from 2 up to N. Then it runs a similar procedure from N down to 2, except that this time you mark the factors of N, find the next largest N1 not yet marked, mark the factors of N1, and so on. Finally, you have to tell how many numbers up to N are left unmarked by both algorithms.

The first observation is that you only need to consider the numbers which are primes. So we can run a sieve up to 10⁷, which allows us to test whether a number is prime in O(1) time.

Now, for each N, we can precalculate the result. If you know ans[n-1] and want to determine ans[n] from it, the inequality |ans[n] - ans[n-1]| <= 1 holds: either the current N is added as an unmarked number, or a number you added in the past is removed by one of the factors of this N. If N is not a prime, it has already been removed by the first algorithm, i.e. the sieve.

If N is a prime number, then no N1 has removed this N already, so in this case ans[n] = ans[n-1] + 1.

If N is an even number and N/2 is a prime, you added N/2 to your result earlier, but it has to be removed at this stage. So in such a case, ans[n] = ans[n-1] - 1.

Why don't you need to consider other factors like N/3, N/5 ... ? Suppose q = N/p is prime for some p = 3, 5 and so on. Then 2q is smaller than N, so q was already removed earlier, when 2q was processed. Only N/2 can be removed for the first time at step N. So the basic idea is as follows:

for( i = 2; i <= N; i++ ) {
    ans[i] = ans[i - 1];
    if( i is even and i/2 is prime ) ans[i] = ans[i] - 1;
    if( i is prime ) ans[i] = ans[i] + 1;
}
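The pseudocode above can be turned into a small runnable sketch. This is only an illustration under my own assumptions: `findprm_table` and `limit` are hypothetical names, and the sieve here is a plain one-flag-per-number sieve rather than anything tuned for the 10⁷ limit mentioned earlier.

```cpp
#include <vector>

// Hypothetical helper: returns ans[0..limit] built with the recurrence above.
std::vector<int> findprm_table(int limit) {
    // Plain sieve: composite[x] is true when x has a proper factor.
    std::vector<bool> composite(limit + 1, false);
    for (int i = 2; (long long)i * i <= limit; i++)
        if (!composite[i])
            for (int j = i * i; j <= limit; j += i)
                composite[j] = true;
    auto is_prime = [&](int x) { return x >= 2 && !composite[x]; };

    std::vector<int> ans(limit + 1, 0);
    for (int i = 2; i <= limit; i++) {
        ans[i] = ans[i - 1];
        if (i % 2 == 0 && is_prime(i / 2)) ans[i]--;  // i/2 gets marked by i
        if (is_prime(i)) ans[i]++;                    // i itself survives both passes
    }
    return ans;
}
```

With the table built once, each query is a single lookup, e.g. `findprm_table(20)[10]` for N = 10.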

I had absolutely no clue at the first glance. Really nice problem.

Thursday, February 28, 2013

Euler Totient / φ Function

Euler Totient / Phi Function φ(n) counts the number of positive integers less than or equal to n which are relatively prime to n, i.e. do not have any common divisor with n except 1.

Formula for Euler Phi function:

$$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$$

Here, the product is over the distinct prime numbers which divide n. Now, you can just factorize n and calculate φ(n) pretty easily. But will that be efficient when you are asked to find φ(n) over a whole range of integers?
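For a single n, the factorize-and-multiply idea might look like the sketch below. It is not the code from this post: `phi_single` is a hypothetical name, and it uses plain trial division, which is fine for one value but too slow to repeat over a whole range.

```cpp
// Hypothetical helper: Euler phi of one value by trial division.
unsigned long long phi_single(unsigned long long n) {
    unsigned long long result = n;
    for (unsigned long long p = 2; p * p <= n; p++) {
        if (n % p == 0) {
            while (n % p == 0) n /= p;       // strip every copy of p
            result = result / p * (p - 1);   // multiply by (p-1)/p, dividing first
        }
    }
    if (n > 1) result = result / n * (n - 1);  // one prime factor may be left over
    return result;
}
```

For example, `phi_single(36)` strips 2 and 3 and ends with 36 * 1/2 * 2/3 = 12.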

If you look closely at the formula, you will see that we multiply n by (p-1)/p for each prime p that divides n. Now recall what we do when we run the Sieve of Eratosthenes to mark primes and non-primes. In the outer loop of the sieve we determine whether a number is prime. In the inner loop, instead of setting flags, we can multiply each multiple by (p-1)/p, where p is the current prime from the outer loop. At the end of the iterations we have φ(n) for every n in the range we are asked for. Here is a source code example that does exactly that. Take a look and try to understand what the loops are doing, and how we perform the calculation and store the results.

#include <cstdio>

const int MAX = 1000001;
int phi[MAX];

void euler_phi() {
    phi[1] = 1;
    for(int i=2; i<MAX; i++) {
        if(!phi[i]) {
            phi[i] = i-1;
            for(int j=(i<<1); j<MAX; j+=i) {
                if(!phi[j]) phi[j] = j;
                phi[j] = phi[j]/i*(i-1);
            }
        }
    }
}

int main() {
    euler_phi();
    for(int t=1; t<MAX; t++) printf("%d\n", phi[t]);
    return 0;
}

A bit of explanation on what we are doing here: initially the phi[] array is all 0 (as it is declared global). We know that phi[1] = 1 and phi[n] = n-1 when n is a prime number. So, similar to the sieve algorithm, we first check in the outer loop whether the current number is prime: if phi[i] = 0, then i is prime, and we set phi[i] to i-1. Now, for all the multiples of i greater than i, starting from 2*i and called j in the inner loop, we need to multiply phi[j] by (i-1) / i. Here we first check whether phi[j] is 0, i.e. whether we are visiting it for the first time. In that case we set it to j because, as I said earlier, for φ(n) we multiply n by (p-1)/p for each prime p that divides n.

Also, one thing to note here: a * b / c and a / c * b are not always the same in integer arithmetic unless you can be sure that c divides a. In this case it does. Why? Because this is basically a prime factoring algorithm, and c = i here. As we are traversing the multiples of i, and the earlier steps only removed factors of smaller primes from phi[j], a = phi[j] is still divisible by c = i. So instead of the a * b / c form we always use a / c * b in these situations, because it often helps prevent overflow.
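The effect of the evaluation order can be seen in a tiny sketch. The two function names are hypothetical, and 32-bit unsigned arithmetic is used on purpose so that the overflow is well defined (it wraps around) rather than undefined as it would be with a signed int.

```cpp
#include <cstdint>

// Both functions are meant to compute a * b / c, assuming c divides a.
// Dividing first keeps the intermediate value small.
uint32_t div_then_mul(uint32_t a, uint32_t b, uint32_t c) { return a / c * b; }

// Multiplying first can wrap around 2^32 before the division happens.
uint32_t mul_then_div(uint32_t a, uint32_t b, uint32_t c) { return a * b / c; }
```

For small inputs the two agree, but with a = 3500000000, b = 6, c = 7 only `div_then_mul` returns the true value 3000000000. In the sieve above, a product like phi[j] * (i - 1) can likewise exceed the 32-bit int range for j close to MAX, which is why the division comes first.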

Now, think about the usual sieve optimizations: discarding even numbers, starting the inner loop from the square of the prime, stepping the inner loop by twice the prime. None of them will work here, because we need to visit every multiple in the inner loop, as we are computing the totient function for every n in the range 1 to MAX. Test the code on a smaller range, check that it is doing this correctly, and play around with it.
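One way to test on a smaller range is to compare the sieve against the definition of φ directly. The sketch below is an assumption-laden illustration, not part of the original program: `phi_sieve`, `phi_brute` and `phi_sieve_matches` are hypothetical helpers, and the sieve is rewritten over a `std::vector` so the limit can be chosen at run time.

```cpp
#include <vector>

// Hypothetical helper: the same sieve idea, on a caller-chosen limit.
std::vector<int> phi_sieve(int limit) {
    std::vector<int> phi(limit + 1, 0);
    if (limit >= 1) phi[1] = 1;
    for (int i = 2; i <= limit; i++) {
        if (phi[i]) continue;            // already touched, so i is composite
        phi[i] = i - 1;                  // i is prime
        for (int j = 2 * i; j <= limit; j += i) {
            if (!phi[j]) phi[j] = j;     // first visit
            phi[j] = phi[j] / i * (i - 1);
        }
    }
    return phi;
}

// Definition-based count of k in [1, n] with gcd(n, k) = 1; small n only.
int phi_brute(int n) {
    int count = 0;
    for (int k = 1; k <= n; k++) {
        int a = n, b = k;
        while (b) { int t = a % b; a = b; b = t; }  // Euclid's algorithm
        if (a == 1) count++;
    }
    return count;
}

// Returns true when the sieve agrees with the definition for every n <= limit.
bool phi_sieve_matches(int limit) {
    std::vector<int> phi = phi_sieve(limit);
    for (int n = 1; n <= limit; n++)
        if (phi[n] != phi_brute(n)) return false;
    return true;
}
```

The brute-force check is quadratic, so it only makes sense for limits of a few thousand at most.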

Tuesday, February 26, 2013

Divisor Function

Prime Factorization and Divisor Function σ_x(n)

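σ_x(n) is defined as the sum of the x-th powers of the distinct positive divisors of n. The function can be expressed as:

$$\sigma_x(n) = \prod_{i=1}^{r} \sum_{j=0}^{a_i} p_i^{jx} = \prod_{i=1}^{r} \frac{p_i^{(a_i+1)x} - 1}{p_i^{x} - 1} \qquad (\text{the closed form needs } x \neq 0)$$

Here, r = ω(n), which is the number of distinct prime factors of n. p_i for i = 1 to r are the prime factors, and a_i is the maximum power of p_i by which n is divisible.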

Clearly, this function can be used for various problems. For example, when x = 0 it is simply the number of distinct positive divisors of n; for x = 1 it is the sum of the distinct positive divisors of n; for x = 2 it is the sum of squares of the positive divisors of n, and so on.
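As a small worked example (my own choice of n, not from the original post), take n = 12 = 2² · 3, so r = 2 with (p_1, a_1) = (2, 2) and (p_2, a_2) = (3, 1). Every divisor of 12 is a power of 2 times a power of 3, so each σ_x splits into one factor per prime:

```latex
\begin{aligned}
\sigma_0(12) &= (1 + 1 + 1)(1 + 1) = 3 \cdot 2 = 6 \\
\sigma_1(12) &= (1 + 2 + 4)(1 + 3) = 7 \cdot 4 = 28 \\
\sigma_2(12) &= (1 + 4 + 16)(1 + 9) = 21 \cdot 10 = 210
\end{aligned}
```

These agree with the divisor list 1, 2, 3, 4, 6, 12: there are 6 divisors, they sum to 28, and their squares sum to 210.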

For programming contest practice, there are a few problems that require the sum of divisors or the number of divisors. Both can be calculated by finding the prime factors with their counts, and then evaluating the function shown above with the appropriate value of x. Also, the form of the function changes once you fix a value for x. After putting x = 0, we get this:

$$\sigma_0(n) = \prod_{i=1}^{r} (a_i + 1)$$

So this is easy: you just need to find the frequency of each prime and multiply (a_i + 1) over all primes to get the number of divisors of n.
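The x = 0 case can be sketched in a few lines. This is a minimal illustration assuming n ≥ 1 and plain trial division; `count_divisors` is a hypothetical name, not a function from this post.

```cpp
// Hypothetical helper: number of positive divisors of n (n >= 1 assumed).
unsigned long long count_divisors(unsigned long long n) {
    unsigned long long result = 1;
    for (unsigned long long p = 2; p * p <= n; p++) {
        unsigned a = 0;                       // frequency of p in n
        while (n % p == 0) { n /= p; a++; }
        result *= (a + 1);                    // a is 0 when p does not divide n
    }
    if (n > 1) result *= 2;                   // leftover prime with frequency 1
    return result;
}
```

For example, 36 = 2² · 3² gives (2 + 1)(2 + 1) = 9 divisors.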

For the sum of divisors, the idea is similar; here x = 1. The following code shows how to find the sum of the distinct positive divisors of n:

#include <cstdio>
#include <cmath>
using namespace std;

#define sq(x) ((x)*(x))
#define i64 unsigned long long
/* Sieve limit: trial division below is complete only for n up to about MAX*MAX */
#define MAX 784
#define LMT 28

unsigned flag[MAX/64 + 1];   /* +1 so that flag[(MAX-1)>>6] stays in bounds */
unsigned primes[5761460], total;

#define chkC(n) (flag[n>>6]&(1<<((n>>1)&31)))
#define setC(n) (flag[n>>6]|=(1<<((n>>1)&31)))

/* Regular sieve of eratosthenes, bitwise implementation */
void sieve()
{
    unsigned i, j, k;
    for(i=3;i<LMT;i+=2)
        if(!chkC(i))
            for(j=i*i,k=i<<1;j<MAX;j+=k)
                setC(j);
    j = 0;
    primes[j++] = 2;
    for(i=3;i<MAX;i+=2)
        if(!chkC(i))
            primes[j++] = i;
    total = j;
}

/* finds n^p in log(p) time */
i64 power(unsigned n, unsigned p)
{
    i64 x=1, y=n;
    while(p > 0)
    {
        if(p&1) x *= y;
        y *= y;
        p >>= 1;
    }
    return x;
}

/* multiplies sigma1 by the factor for prime n with frequency p, i.e. 1 + n + ... + n^p */
inline void update(i64 &sigma1, i64 n, unsigned p)
{
    if(p==1) sigma1 *= (n+1);
    else sigma1 *= ((power(n,p+1)-1)/(n-1));
}

/* Factorization function, we do not need to store the primes here, instead,
   whenever a prime is found, we update sigma 1 with that prime and its frequency */
void factor(i64 n, i64 &sigma1)
{
    unsigned i, v;
    i64 t;
    for(i=0, t=primes[i]; i<total && t*t <= n; t = primes[++i])
    {
        if(n % t == 0)
        {
            v = 0;
            while(n % t == 0)
            {
                v++;
                n /= t;
            }
            update(sigma1, primes[i], v);
        }
    }
    if(n>1) update(sigma1, n, 1);
}

/* Our beloved main function */
int main()
{
    int t, x;
    i64 n, sigma1;
    sieve();
    scanf("%d", &t);
    for(x=1; x<=t; x++)
    {
        scanf("%llu", &n);
        sigma1 = 1;   /* the product must start from 1 for every test case */
        factor(n, sigma1);
        printf("%llu\n",sigma1);
    }
    return 0;
}

We just need the value of σ, so we do not store the prime factors here. Similar problems would be to find the number of odd divisors of n, or to test whether a number is square free, i.e. no perfect square greater than 1 divides n. I am going to leave these as an exercise for readers.

For a closer look at the σ function, see Wikipedia.

Monday, September 14, 2009

Segmented Sieve

Memory and time efficient :)

Problem Statement:

You are given two integers a and b. You have to find all the primes in the range from a to b. Here, 1 ≤ a ≤ b ≤ 2³¹-1 and b - a ≤ 10⁵.

Note: You have to handle 1, 2 and the even numbers yourself, as appropriate for your case.

Solution:


#include <string.h>

#define MAX 46656
#define LMT 216
#define LEN 4830
#define RNG 100032

unsigned base[MAX/64], segment[RNG/64], primes[LEN];

#define sq(x) ((x)*(x))
#define mset(x,v) memset(x,v,sizeof(x))
#define chkC(x,n) (x[n>>6]&(1<<((n>>1)&31)))
#define setC(x,n) (x[n>>6]|=(1<<((n>>1)&31)))

/* Generates all the necessary prime numbers and marks them in base[]*/
void sieve()
{
    unsigned i, j, k;
    for(i=3; i<LMT; i+=2)
        if(!chkC(base, i))
            for(j=i*i, k=i<<1; j<MAX; j+=k)
                setC(base, j);
    for(i=3, j=0; i<MAX; i+=2)
        if(!chkC(base, i))
            primes[j++] = i;
}

/* Returns the prime-count within range [a,b] and marks them in segment[] */
int segmented_sieve(int a, int b)
{
    unsigned i, j, k, cnt = (a<=2 && 2<=b)? 1 : 0;
    if(b<2) return 0;
    if(a<3) a = 3;
    if(a%2==0) a++;
    if(a>b) return cnt;    /* no odd number left in range; keeps b-a from wrapping as unsigned below */
    mset(segment,0);
    for(i=0; sq(primes[i])<=b; i++)
    {
        j = primes[i] * ( (a+primes[i]-1) / primes[i] );
        if(j%2==0) j += primes[i];
        for(k=primes[i]<<1; j<=b; j+=k)
            if(j!=primes[i])
                setC(segment, (j-a));
    }
    for(i=0; i<=b-a; i+=2)
        if(!chkC(segment, i))
            cnt++;
    return cnt;
}

This is a sample program which demonstrates the segmented sieve. It is a very fast and memory efficient version. 'base' is the array which holds the flags for all the numbers up to √(2³¹-1), i.e. the square root of the max limit, and all the odd primes found there are stored in the 'primes' array. Later, these primes are used to determine whether a number within a certain range is composite in the segmented sieve. To avoid overflow and sign bit problems, the unsigned type is used.
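To see the segmenting idea without the bit tricks, here is a simplified sketch. It is not the program above: `count_primes_in_range` is a hypothetical name, it keeps one flag per number (even numbers included) in a `std::vector<bool>`, and it assumes 1 ≤ a ≤ b with b - a small enough to allocate.

```cpp
#include <vector>

// Hypothetical helper: counts the primes in [a, b] with one flag per number.
long long count_primes_in_range(long long a, long long b) {
    if (b < 2) return 0;
    if (a < 2) a = 2;
    // root = floor(sqrt(b)), found without floating point.
    long long root = 1;
    while ((root + 1) * (root + 1) <= b) root++;

    std::vector<bool> small(root + 1, false);      // true => composite, for [0, root]
    std::vector<bool> composite(b - a + 1, false); // true => composite, for [a, b]
    for (long long p = 2; p <= root; p++) {
        if (small[p]) continue;                    // p is not prime
        for (long long q = p * p; q <= root; q += p) small[q] = true;
        // First multiple of p inside [a, b], but never p itself.
        long long start = (a + p - 1) / p * p;
        if (start < p * p) start = p * p;
        for (long long m = start; m <= b; m += p) composite[m - a] = true;
    }
    long long cnt = 0;
    for (long long i = 0; i <= b - a; i++)
        if (!composite[i]) cnt++;
    return cnt;
}
```

The structure is the same as in the program above: a base sieve up to the square root of b, then each base prime crosses out its multiples inside the segment, indexed relative to a.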

A little explanation:

First of all, what do these macros mean?
#define MAX 46656
#define LMT 216
#define LEN 4830
#define RNG 100032

MAX is the square root of the maximum possible input. Here the maximum is the 32 bit signed integer maximum, 2147483647, whose square root is about 46341; MAX = 46656 = 216² is the next perfect square above it. So MAX is not the maximum allowed input, it is just the square root of it, rounded up.
LMT is the square root of MAX. As we all know, the outer loop of a sieve only needs to run up to the square root of its limit.
LEN is the size of the primes[] array, sized to hold the odd primes below MAX. RNG is the width of the range on which the segmented sieve will run and collect the primes: b - a ≤ 10⁵, rounded up to a multiple of 64.

Now the next two vital macros:
#define chkC(x,n) (x[n>>6]&(1<<((n>>1)&31)))
#define setC(x,n) (x[n>>6]|=(1<<((n>>1)&31)))
And why do we divide by 64?

Ok, yes, it is clearly bit shifting. But you must know what we do in a bitwise sieve first in order to get this. Instead of using a whole array element to store just one flag, we can use each of its 32 bits to store one flag, which cuts the memory to 1/32. So it would be logical to allocate MAX/32 elements, but why MAX/64 and RNG/64 here?
Because there is no point in handling the even numbers: 2 is the only even prime and we can handle it manually, without any stress. So, if we do not consider the even numbers at all, the count is again reduced by a factor of 1/2. What we get in total is MAX/32/2 = MAX/64, and the same for RNG.

Now, the two macros chkC and setC are pretty straightforward. chkC checks whether a specific bit flag is 1 or 0, and setC sets a specific bit flag to 1 to mark it as a composite. They work similarly, so I will explain only the chkC part.

In a bitwise sieve, where is a specific value n located? Obviously at the (n/32)-th index, and within that index at the (n%32)-th bit from the LSB (right hand side). But as we just said, we are interested only in odd numbers, so we map n to n/2 as follows:


Actual numbers
 1   2   3   4   5   6   7   8   9 ...........   n     ;[n is odd]
 |   |   |   |   |   |   |   |   | ...........   |
 0   x   1   x   2   x   3   x   4 ........... (n/2)
Position on which they are represented in the bitstrings.

So, n is actually n/2, which implies the previous statement: the position of n (actually of n/2) is at index [n/2/32] = [n/64] = [n>>6], at bit position (n/2)%32 = (n>>1)&31 [we know that modding by a power of 2 is the same as ANDing with (that power of 2)-1]. The rest is how we check / set this specific bit. I have another post explaining these operations: Bitwise operations in C: Part 3, and obviously you can google it :)
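The same mapping can be written out with named functions instead of macros, which may make the index arithmetic easier to follow. This is only a sketch: `OddBitset` and its members are hypothetical names, and it assumes that only odd n are ever passed in.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical wrapper: one bit per odd number, 32 odd numbers per word.
struct OddBitset {
    std::vector<uint32_t> words;

    // limit / 64 + 1 words are enough for every odd n <= limit.
    explicit OddBitset(unsigned limit) : words(limit / 64 + 1, 0) {}

    static unsigned word_index(unsigned n) { return n >> 6; }        // (n/2)/32
    static unsigned bit_index(unsigned n)  { return (n >> 1) & 31; } // (n/2)%32

    // Same test as chkC: is the flag for n set?
    bool test(unsigned n) const {
        return (words[word_index(n)] >> bit_index(n)) & 1u;
    }
    // Same effect as setC: mark n.
    void set(unsigned n) {
        words[word_index(n)] |= (1u << bit_index(n));
    }
};
```

For instance, n = 63 lands in word 0 at bit 31, and n = 65 is the first odd number stored in word 1, at bit 0.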