I need to extract data from a large number of PDF files. Some of them are encrypted, so the data cannot be extracted correctly.
Even though I followed Algorithms 3.1 to 3.3 in section 3.5 of pdf_reference_1-7, I still cannot decrypt the object streams.
One of the sample PDFs contains:
trailer
<<
/Size 89
/Info 48 0 R
/Encrypt 55 0 R
/Root 54 0 R
/Prev 66413
/ID[<7d8b7b91c0cf4524553df76699e9c489><7d8b7b91c0cf4524553df76699e9c489>]
>>
...
55 0 obj
<<
/Filter /Standard
/R 2
/O (駰箖5-?絖
袛苳? }m * ? 骨)
/U (? 鮯q|冢x??/k E?p9E邋 齕??)
/P -60
/V 1
>>
Is there anything wrong with my steps below?
private class EncryptObj{
    String ID0="";
    int R=-1; //keep the entry of /R of 55 0 obj
    int V=-1; //keep the entry of /V of 55 0 obj
    int P=-1;
    String Filter=""; //keep the entry of /Filter of 55 0 obj
    byte[] O; //keep the entry of /O of 55 0 obj
    byte[] U; //keep the entry of /U of 55 0 obj
    byte[] OValue;
    byte[] UValue;
    byte[] EncryptionKey=null;
}
private EncryptObj eo=new EncryptObj();
private void GetEncryptionKey(){
    eo.OValue=ComputeOValue();
    eo.EncryptionKey=ComputeEncryptionKey(eo.U);
    RC4 rc4=new RC4(eo.EncryptionKey);
    eo.UValue=rc4.Encrypt(paddingString);
    eo.EncryptionKey=ComputeEncryptionKey(eo.UValue);
}
private byte[] ComputeOValue(){
    try{
        byte[] b32=PaddingTo32Bytes(eo.O);
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(b32);
        byte[] digest=md.digest();
        byte[] tEncryptionKey=new byte[5];
        System.arraycopy(digest, 0, tEncryptionKey, 0, tEncryptionKey.length);
        RC4 rc4=new RC4(tEncryptionKey);
        b32=PaddingTo32Bytes(eo.U);
        return rc4.Encrypt(b32);
    }catch(NoSuchAlgorithmException ex){
        throw new RuntimeException(ex);
    }
}
private byte[] PaddingTo32Bytes(byte[] inba){
    byte[] b32=new byte[32];
    if(inba.length>=32){
        System.arraycopy(inba, 0, b32, 0, 32);
    }else{
        System.arraycopy(inba, 0, b32, 0, inba.length);
        System.arraycopy(paddingString, 0, b32, inba.length, 32-inba.length);
    }
    return b32;
}
private byte[] ComputeEncryptionKey(byte[] pw){
    byte[] EncryptionKey;
    int len=0;
    try{
        MessageDigest md = MessageDigest.getInstance("MD5");
        ByteBuffer bb=ByteBuffer.allocate(128);
        bb.put(PaddingTo32Bytes(pw));
        len+=32;
        bb.put(eo.OValue);
        len+=eo.OValue.length;
        byte[] P = ByteBuffer.allocate(4).putInt(eo.P).array();
        byte[] P_lower_order=new byte[P.length];
        for(int i=0;i<P.length;i++){
            P_lower_order[i]=P[P.length-1-i];
        }
        bb.put(P_lower_order);
        len+=P_lower_order.length;
        byte[] ID0=HexToBytes(eo.ID0.substring(1, eo.ID0.length()-1).getBytes());
        bb.put(ID0);
        len+=ID0.length;
        byte[] inBytes=new byte[len];
        for(int i=0;i<len;i++){
            inBytes[i]=bb.get(i);
        }
        byte[] digest=md.digest(inBytes);
        EncryptionKey=new byte[5];
        System.arraycopy(digest, 0, EncryptionKey, 0, EncryptionKey.length);
    }catch(NoSuchAlgorithmException ex){
        throw new RuntimeException(ex);
    }
    return EncryptionKey;
}
private byte[] HexToBytes(byte[] inhex){
    byte[] outb=new byte[inhex.length/2];
    if ((inhex.length%2)==1){
        return null;
    }
    String hex=ByteArrayToString(inhex).toLowerCase();
    String numberS="0123456789abcdef";
    int ti, m, n;
    for(int i=0;i<outb.length;i++){
        if((m=numberS.indexOf(hex.substring(2*i, 2*i+1)))==-1){
            return null;
        }
        if((n=numberS.indexOf(hex.substring(2*i+1, 2*i+2)))==-1){
            return null;
        }
        ti=m*16+n;
        outb[i]=(byte)ti;
    }
    return outb;
}
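For comparison, Algorithm 3.2 of the reference, restricted to /R 2, can be sketched as follows. This is only a minimal sketch under stated assumptions: the document is opened with the empty user password, the /O string is fed to MD5 exactly as stored in the file (it is not recomputed), and the key length is 5 bytes as for /V 1. The names FileKeySketch, pad32 and fileKey are hypothetical and do not appear in the code above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: Algorithm 3.2 for /R 2 only (no 50x rehash, no metadata flag).
final class FileKeySketch {
    // The 32-byte padding string from Algorithm 3.2, step 1.
    static final byte[] PAD = {
        0x28, (byte) 0xBF, 0x4E, 0x5E, 0x4E, 0x75, (byte) 0x8A, 0x41,
        0x64, 0x00, 0x4E, 0x56, (byte) 0xFF, (byte) 0xFA, 0x01, 0x08,
        0x2E, 0x2E, 0x00, (byte) 0xB6, (byte) 0xD0, 0x68, 0x3E, (byte) 0x80,
        0x2F, 0x0C, (byte) 0xA9, (byte) 0xFE, 0x64, 0x53, 0x69, 0x7A
    };

    // Truncate or pad a password to exactly 32 bytes.
    static byte[] pad32(byte[] password) {
        byte[] out = new byte[32];
        int n = Math.min(password.length, 32);
        System.arraycopy(password, 0, out, 0, n);
        System.arraycopy(PAD, 0, out, n, 32 - n);
        return out;
    }

    // userPassword: raw password bytes (empty array for no password).
    // oEntry: the /O string as read from the encryption dictionary.
    // p: the /P value as a signed 32-bit integer.
    // id0: decoded bytes of the first /ID string in the trailer.
    static byte[] fileKey(byte[] userPassword, byte[] oEntry, int p, byte[] id0) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(pad32(userPassword));
            md.update(oEntry);
            // /P as 4 bytes, low-order byte first.
            md.update(ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(p).array());
            md.update(id0);
            byte[] key = new byte[5]; // first 5 bytes of the digest for /R 2
            System.arraycopy(md.digest(), 0, key, 0, key.length);
            return key;
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

For the sample above, the inputs would be an empty password, the raw 32 bytes of /O, P = -60, and the 16 bytes decoded from the first /ID string.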
After calling GetEncryptionKey(), the encryption key is in eo.EncryptionKey.
Then, following Algorithm 3.1, I pass eo.EncryptionKey + object number (3 bytes, low-order byte first) + generation number (2 bytes, low-order byte first)
to MD5, and take the first 10 bytes of the MD5 output as the RC4 key to decrypt the object stream.
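The per-object key step described above can be sketched roughly like this. It is a minimal sketch that assumes RC4 encryption (no AES), and the names ObjectKeySketch and objectKey are hypothetical. The RC4 decryption itself is left out.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: per-object key derivation from Algorithm 3.1 (RC4 case).
final class ObjectKeySketch {
    static byte[] objectKey(byte[] fileKey, int objNum, int genNum) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(fileKey);
            md.update(new byte[] {
                // Object number: 3 bytes, low-order byte first.
                (byte) objNum, (byte) (objNum >> 8), (byte) (objNum >> 16),
                // Generation number: 2 bytes, low-order byte first.
                (byte) genNum, (byte) (genNum >> 8)
            });
            byte[] digest = md.digest();
            // Key length is n + 5 bytes, capped at 16 (10 bytes for a 5-byte file key).
            int n = Math.min(fileKey.length + 5, 16);
            byte[] key = new byte[n];
            System.arraycopy(digest, 0, key, 0, n);
            return key;
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

The object and generation numbers are those of the indirect object that contains the stream, for example 55 and 0 for an object written as 55 0 obj.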
But the following error occurs:
Exception in thread "main" java.lang.RuntimeException: com.sun.pdfview.PDFParseException: Data format exception:incorrect header check
Please help me. Thanks a lot.