Monthly Archives: January 2015

Pangram (holoalphabetic) detection

A pangram (Greek: pan gramma, “every letter”) or holoalphabetic sentence for a given alphabet is a sentence using every letter of the alphabet at least once.  For example: “The quick brown fox jumps over the lazy dog.”

pangram

Pangram

In this post, I will be discussing two methods to detect whether a given string is a pangram or not. The first method to detect a pangram would be employing an array to store the occurrence status and the second method to detect a pangram would use a single integer variable to store the occurrence status in its individual bits using bit-wise operators.

Pangram detection using an array

class Pangram{
    public static boolean isPangram( String str ){
        boolean status[] = new boolean[26];
        char ch;
        for(int i=0; i<str.length();i++){
            ch=str.charAt(i);
            if(Character.isLetter(ch)){
                ch=Character.toUpperCase(str.charAt(i));
                status[ch-65]=true;
            }
        }
        for(int i=0;i<status.length;i++){
            if(!status[i]) return false;
        }
        return true;
    }
    public static void main( String args[] ){
        System.out.println( isPangram1("abcABC") );
        System.out.println( isPangram1("ABCDEFGHIJKLMNOPQRSTUVWXYZ") );
        System.out.println( isPangram1("The Quick Brown Fox Jumps Over The Lazy Dog") );
    }
}

The concept in this approach is to use a boolean array as a collection of flag variables to store the occurrence status of each alphabet. The string would not be a pangram if there is any flag location storing false if all the flags locations store true, then the string would be a pangram.

Pangram detection using an integer variable through the bit-wise operator

class Pangram{
    public static boolean isPangram( String str ){
        int status = 0;
        char ch;
        for( int i = 0; i < str.length(); i++ ){
             ch=str.charAt( i );
             if( ch >= 'A' && ch <= 'Z' ) {
                status |= 1 << ( ch - 'A' );
             }else if( ch >= 'a' && ch <= 'z' ){
                status |= 1 << ( ch - 'a' );
            }
            if( status == 0b11111111111111111111111111 ) return true;
        }
        return false;
    }
    public static void main( String args[] ){
        System.out.println( isPangram1("abcABC") );
        System.out.println( isPangram1("ABCDEFGHIJKLMNOPQRSTUVWXYZ") );
        System.out.println( isPangram1("The Quick Brown Fox Jumps Over The Lazy Dog") );
    }
}

In this approach of finding pangram, we are using the individual bits of an integer variable (integer variables are of 4 bytes or 32 bits) to store the occurrence status of each alphabet. Last bit is being used for storing the presence (or absence) of ‘A’ or ‘a’, second last bit is being used for storing the presence (or absence) of ‘B’ or ‘b’ and so on. The for loop is for extracting each character of a string sequentially. The extracted character is then checked for being an uppercase or lowercase character. We are subtracting the value of ‘A’ (65) or ‘a’ (97) so as the get a value 0 for both ‘A’ and ‘a’, value 1 for both ‘B’ and ‘b’ and so on. The left shift operator (<<) is then used to shift the one to its proper place; meaning 1 for ‘A’ or ‘a’, 10 for ‘B’ or ‘b’, 100 for ‘C’ or ‘c’ and so on. We then bitwise OR this value into the status variable so as to set the corresponding bit 1 at the required position. As soon as the status variable is equal to 0b11111111111111111111111111 (0b  for binary) that is last twenty-six bit become 1 we return true since that indicates that all letters have occurred at least once. In case 0b11111111111111111111111111 is not encountered and the string is over it means that the sentence is not a pangram. The complexity of this algorithm is O(N). Also, note that the memory requirement is significantly lower than the earlier algorithm using a boolean array.