Parsing X.509 Certificates with OpenSSL and C
Zakir Durumeric | October 13, 2013
While OpenSSL has become one of the de facto libraries for performing SSL and TLS operations, the library is surprisingly opaque and its documentation is, at times, abysmal. As part of our recent research, we have been performing Internet-wide scans of HTTPS hosts in order to better understand the HTTPS ecosystem (Analysis of the HTTPS Certificate Ecosystem, ZMap: Fast Internet-Wide Scanning and its Security Applications). We use OpenSSL for many of these operations, including parsing X.509 certificates. However, in order to parse and validate certificates, our team had to dig through parts of the OpenSSL code base and multiple sources of documentation to find the correct functions to parse each piece of data. This post is intended to document many of these operations in a single location in order to hopefully alleviate this painful process for others.
If you have found other pieces of code particularly helpful, please don’t hesitate to send them along and we’ll update the post. I want to note that if you’re starting to develop against OpenSSL, O’Reilly’s Network Security with OpenSSL is an incredibly helpful resource; the book contains many snippets and pieces of documentation that I was not able to find anywhere online. I also want to thank James Kasten, who helped find and document several of these solutions.
Creating an OpenSSL X509 Object
All of the operations we discuss start with either a single X.509 certificate or a “stack” of certificates. OpenSSL represents a single certificate with an X509 struct and a list of certificates, such as the certificate chain presented during a TLS handshake, as a STACK_OF(X509). Given that the parsing and validation stems from here, it only seems reasonable to start with how to create or access an X509 object. A few common scenarios are:
1. You have initiated an SSL or TLS connection using OpenSSL.
In this case, you have access to an OpenSSL SSL struct from which you can extract the presented certificate as well as the entire certificate chain that the server presented to the client. In our specific case, we use libevent to perform TLS connections and can access the SSL struct from the libevent bufferevent: SSL *ssl = bufferevent_openssl_get_ssl(bev). This will differ depending on how you complete your connection. However, once you have your SSL struct, the server certificate and presented chain can be extracted as follows:
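#include <openssl/ssl.h>
#include <openssl/x509.h>
#include <openssl/x509v3.h>
/* the returned certificate must be released with X509_free() */
X509 *cert = SSL_get_peer_certificate(ssl);
/* the returned stack is owned by the SSL object; do not free it */
STACK_OF(X509) *sk = SSL_get_peer_cert_chain(ssl);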
We have found that, at times, OpenSSL will produce an empty certificate chain (SSL_get_peer_cert_chain will come back NULL) even though the peer has presented a certificate. Normally, the server certificate is the first certificate in the stack, followed by the remaining chain. It’s unclear to us why this happens, but it’s not a deal breaker, as it’s easy to create a new stack of certificates:
X509 *cert = SSL_get_peer_certificate(ssl);
STACK_OF(X509) *sk = sk_X509_new_null();
sk_X509_push(sk, cert);
2. You have stored a certificate on disk as a PEM file.
For reference, a PEM file is the Base64-encoded version of an X.509 certificate, which should look similar to the following:
-----BEGIN CERTIFICATE-----
MIIHIDCCBgigAwIBAgIIMrM8cLO76sYwDQYJKoZIhvcNAQEFBQAwSTELMAkGA1UE
BhMCVVMxEzARBgNVBAoTCkdvb2dsZSBJbmMxJTAjBgNVBAMTHEdvb2dsZSBJbnRl
iftrJvzAOMAPY5b/klZvqH6Ddubg/hUVPkiv4mr5MfWfglCQdFF1EBGNoZSFAU7y
ZkGENAvDmv+5xVCZELeiWA2PoNV4m/SW6NHrF7gz4MwQssqP9dGMbKPOF/D2nxic
TnD5WkGMCWpLgqDWWRoOrt6xf0BPWukQBDMHULlZgXzNtoGlEnwztLlnf0I/WWIS
eBSyDTeFJfopvoqXuws23X486fdKcCAV1n/Nl6y2z+uVvcyTRxY2/jegmV0n0kHf
gfcKzw==
-----END CERTIFICATE-----
In this case, you can access the certificate as follows:
#include <stdio.h>
#include <stdlib.h>
#include <openssl/pem.h>
#include <openssl/x509.h>
#include <openssl/x509v3.h>
FILE *fp = fopen(path, "r");
if (!fp) {
    fprintf(stderr, "unable to open: %s\n", path);
    return EXIT_FAILURE;
}
X509 *cert = PEM_read_X509(fp, NULL, NULL, NULL);
if (!cert) {
    fprintf(stderr, "unable to parse certificate in: %s\n", path);
    fclose(fp);
    return EXIT_FAILURE;
}
// any additional processing would go here..
X509_free(cert);
fclose(fp);
3. You have access to the raw certificate in memory.
In the case that you have access to the raw encoding of the certificate in memory, you can parse it as follows. This is useful if you have stored raw certificates in a database or similar data store.
#include <stdio.h>
#include <stdlib.h>
#include <openssl/x509.h>
#include <openssl/x509v3.h>
#include <openssl/bio.h>
const unsigned char *data = ... ;
size_t len = ... ;
/* d2i_X509 advances the pointer it is given, so pass a copy to keep data intact */
const unsigned char *p = data;
X509 *cert = d2i_X509(NULL, &p, (long) len);
if (!cert) {
    fprintf(stderr, "unable to parse certificate in memory\n");
    return EXIT_FAILURE;
}
// any additional processing would go here..
X509_free(cert);
4. You have access to the Base64 encoded PEM in memory.
const char *pemCertString = ... ; /* includes the "-----BEGIN/END CERTIFICATE-----" lines */
size_t certLen = strlen(pemCertString);
BIO *certBio = BIO_new(BIO_s_mem());
BIO_write(certBio, pemCertString, (int) certLen);
X509 *certX509 = PEM_read_bio_X509(certBio, NULL, NULL, NULL);
if (!certX509) {
    fprintf(stderr, "unable to parse certificate in memory\n");
    BIO_free(certBio); /* do not leak the BIO on the error path */
    return EXIT_FAILURE;
}
// do stuff
BIO_free(certBio);
X509_free(certX509);
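The PEM and raw forms used in the scenarios above are two encodings of the same bytes: stripping the BEGIN/END lines and Base64-decoding the body yields the DER data that d2i_X509 expects. The following is a minimal sketch of that relationship in plain C. The names pem_to_der and b64_value are hypothetical helpers written for this illustration, not OpenSSL functions; in practice PEM_read_bio_X509 does this work for you.

```c
#include <stddef.h>

/* Map one Base64 character to its 6-bit value, or -1 if it is not in the alphabet. */
static int b64_value(char ch)
{
    if (ch >= 'A' && ch <= 'Z') return ch - 'A';
    if (ch >= 'a' && ch <= 'z') return ch - 'a' + 26;
    if (ch >= '0' && ch <= '9') return ch - '0' + 52;
    if (ch == '+') return 62;
    if (ch == '/') return 63;
    return -1;
}

/*
 * Decode the Base64 body of a PEM string into out.
 * Lines that start with '-' (the BEGIN/END markers) are skipped.
 * Returns the number of bytes written, or -1 on an invalid character
 * or when out_cap is too small.
 */
long pem_to_der(const char *pem, unsigned char *out, size_t out_cap)
{
    size_t n = 0;          /* bytes written so far */
    unsigned int acc = 0;  /* pending bits not yet written out */
    int bits = 0;          /* number of valid bits in acc */
    int at_line_start = 1;

    for (const char *p = pem; *p != '\0'; p++) {
        if (at_line_start && *p == '-') {
            /* skip the whole marker line */
            while (*p != '\0' && *p != '\n')
                p++;
            if (*p == '\0')
                break;
            continue; /* the loop increment steps past the newline */
        }
        if (*p == '\n' || *p == '\r') {
            at_line_start = 1;
            continue;
        }
        at_line_start = 0;
        if (*p == ' ' || *p == '\t')
            continue;
        if (*p == '=')
            break; /* padding marks the end of the data */

        int v = b64_value(*p);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (unsigned int) v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n >= out_cap)
                return -1;
            out[n++] = (unsigned char) ((acc >> bits) & 0xFF);
            acc &= (1u << bits) - 1; /* keep only the leftover bits */
        }
    }
    return (long) n;
}
```

Decoding the first eight characters of the example certificate above (MIIHIDCC) gives the bytes 30 82 07 20 30 82: an ASN.1 SEQUENCE tag followed by a two-byte length, which is why PEM certificates so often begin with MII.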
Parsing Certificates
Now that we have access to a certificate in OpenSSL, we’ll focus on how to extract useful data from the certificate. We don’t include the #includes in every statement, but use the following headers throughout our codebase:
#include <openssl/x509v3.h>
#include <openssl/bn.h>
#include <openssl/asn1.h>
#include <openssl/x509.h>
#include <openssl/x509_vfy.h>
#include <openssl/pem.h>
#include <openssl/bio.h>
OpenSSL_add_all_algorithms();
You will also need the development versions of the OpenSSL libraries and to compile with -lssl (and typically -lcrypto, which is where the X.509 functions live).
Subject and Issuer
The certificate subject and issuer can be easily extracted and represented as a single string as follows:
char *subj = X509_NAME_oneline(X509_get_subject_name(cert), NULL, 0);
char *issuer = X509_NAME_oneline(X509_get_issuer_name(cert), NULL, 0);
These can be freed by calling OPENSSL_free.
By default, the subject and issuer are returned in the following form:
/C=US/ST=California/L=Mountain View/O=Google Inc/CN=*.google.com
If you want to convert these into a more traditional looking DN, such as:
C=US, ST=Texas, L=Austin, O=Polycom Inc., OU=Video Division, CN=a.digitalnetbr.net
they can be converted with the following code:
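/* tmpBuf holds the X509_NAME_oneline output; destination has room for size bytes */
int i, curr_spot = 0;
char *s = tmpBuf + 1; /* skip the first slash */
char *c = s;
while (1) {
    /* a new component starts at "/X=" or "/XX="; the end of the string closes the last one */
    if (((*s == '/') && ((s[1] >= 'A') && (s[1] <= 'Z') &&
        ((s[2] == '=') || ((s[2] >= 'A') && (s[2] <= 'Z')
        && (s[3] == '='))))) || (*s == '\0')) {
        i = s - c;
        strncpy(destination + curr_spot, c, i);
        curr_spot += i;
        assert(curr_spot < size);
        c = s + 1; /* skip following slash */
        if (*s != '\0') {
            strncpy(destination + curr_spot, ", ", 2);
            curr_spot += 2;
        }
    }
    if (*s == '\0')
        break;
    ++s;
}
destination[curr_spot] = '\0'; /* the strncpy calls above never write the terminator */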
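For experimenting with this conversion outside of a larger program, the same idea can be packaged as a small bounds-checked function. This is only a sketch: oneline_to_dn and starts_attribute are hypothetical names, and the function uses the same heuristic as the loop above (a slash followed by one or two uppercase letters and an equals sign starts a new component).

```c
#include <stddef.h>
#include <string.h>

/* Nonzero if s points at the start of a component such as "/C=" or "/CN=". */
static int starts_attribute(const char *s)
{
    if (s[0] != '/' || s[1] < 'A' || s[1] > 'Z')
        return 0;
    if (s[2] == '=')
        return 1;
    return s[2] >= 'A' && s[2] <= 'Z' && s[3] == '=';
}

/*
 * Convert "/C=US/O=Example" into "C=US, O=Example".
 * Returns 0 on success, or -1 if the input does not start with a slash
 * or dst (size bytes) is too small for the result and its terminator.
 */
int oneline_to_dn(const char *oneline, char *dst, size_t size)
{
    size_t used = 0;

    if (oneline[0] != '/' || size == 0)
        return -1;

    for (const char *s = oneline + 1; *s != '\0'; s++) {
        const char *piece = s; /* by default copy the character itself */
        size_t piece_len = 1;

        if (starts_attribute(s)) {
            piece = ", ";      /* replace the separating slash */
            piece_len = 2;
        }
        if (used + piece_len >= size)
            return -1;         /* always leave room for the terminator */
        memcpy(dst + used, piece, piece_len);
        used += piece_len;
    }
    dst[used] = '\0';
    return 0;
}
```

Because the split is purely textual, a value that itself contains something like /CN= is split as well, and attributes with longer names (for example emailAddress) are left attached to the preceding component. Iterating over the X509_NAME entries avoids both problems.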
It is also possible to extract particular elements from the subject. For example, the following code will iterate over all the values in the subject:
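X509_NAME *subj = X509_get_subject_name(cert);
for (int i = 0; i < X509_NAME_entry_count(subj); i++) {
    X509_NAME_ENTRY *e = X509_NAME_get_entry(subj, i);
    ASN1_STRING *d = X509_NAME_ENTRY_get_data(e);
    /* the value may use different encodings and contain embedded NUL bytes; ASN1_STRING_length(d) gives its length */
    char *str = (char *) ASN1_STRING_data(d);
}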
or
int lastpos = -1; /* -1 starts the search at the first entry */
for (;;) {
    lastpos = X509_NAME_get_index_by_NID(subj, NID_commonName, lastpos);
    if (lastpos == -1)
        break;
    X509_NAME_ENTRY *e = X509_NAME_get_entry(subj, lastpos);
    /* Do something with e */
}
Cryptographic (e.g. SHA-1) Fingerprint
We can calculate the SHA-1 fingerprint (or any other fingerprint) with the following code:
#define SHA1LEN 20
unsigned char buf[SHA1LEN];
const EVP_MD *digest = EVP_sha1();
unsigned len;
int rc = X509_digest(cert, digest, buf, &len);
if (rc == 0 || len != SHA1LEN) {
    return EXIT_FAILURE;
}
return EXIT_SUCCESS;
This will produce the raw fingerprint. This can be converted to the human readable hex version as follows:
void hex_encode(const unsigned char *readbuf, char *writebuf, size_t len)
{
    /* writebuf must hold 2*len + 1 bytes; the last sprintf writes the terminating NUL */
    for (size_t i = 0; i < len; i++) {
        sprintf(writebuf + 2 * i, "%02x", readbuf[i]);
    }
}
char strbuf[2*SHA1LEN+1];
hex_encode((const unsigned char *) buf, strbuf, SHA1LEN);
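Fingerprints are also commonly displayed as colon-separated uppercase hex pairs. The sketch below formats a raw digest that way in plain C; format_fingerprint is a hypothetical helper name, and the digest is assumed to have already been computed (for example with X509_digest as above).

```c
#include <stddef.h>

/*
 * Write digest as "AB:CD:EF" into out.
 * out_cap must be at least 3*len bytes (2 hex digits per byte, len-1
 * colons, and the terminating NUL). Returns 0 on success, -1 otherwise.
 */
int format_fingerprint(const unsigned char *digest, size_t len,
                       char *out, size_t out_cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t needed = (len == 0) ? 1 : len * 3;
    size_t pos = 0;

    if (out_cap < needed)
        return -1;

    for (size_t i = 0; i < len; i++) {
        if (i > 0)
            out[pos++] = ':';              /* separator between bytes */
        out[pos++] = hex[digest[i] >> 4];  /* high nibble */
        out[pos++] = hex[digest[i] & 0x0F]; /* low nibble */
    }
    out[pos] = '\0';
    return 0;
}
```

For a SHA-1 digest this needs a 60-byte buffer (3 * SHA1LEN).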
Version
Parsing the certificate version is straightforward; the only oddity is that it is zero-indexed:
int version = ((int) X509_get_version(cert)) + 1;
Serial Number
Serial numbers can be arbitrarily large and may be positive or negative. As such, we handle them as strings rather than as typical integers in our processing.
#define SERIAL_NUM_LEN 1000
char serial_number[SERIAL_NUM_LEN+1];
ASN1_INTEGER *serial = X509_get_serialNumber(cert);
BIGNUM *bn = ASN1_INTEGER_to_BN(serial, NULL);
if (!bn) {
    fprintf(stderr, "unable to convert ASN1INTEGER to BN\n");
    return EXIT_FAILURE;
}
char *tmp = BN_bn2dec(bn);
if (!tmp) {
    fprintf(stderr, "unable to convert BN to decimal string.\n");
    BN_free(bn);
    return EXIT_FAILURE;
}
if (strlen(tmp) > SERIAL_NUM_LEN) {
    fprintf(stderr, "buffer length shorter than serial number\n");
    BN_free(bn);
    OPENSSL_free(tmp);
    return EXIT_FAILURE;
}
/* the length check above guarantees the terminating NUL is copied too */
strncpy(serial_number, tmp, SERIAL_NUM_LEN+1);
BN_free(bn);
OPENSSL_free(tmp);
Signature Algorithm
The signature algorithm on a certificate is stored as an OpenSSL NID:
/* sig_alg holds the signature algorithm; cert_info->key->algor is the public key algorithm */
/* (OpenSSL 1.0.2 and later also provide X509_get_signature_nid(cert)) */
int sig_nid = OBJ_obj2nid(cert->sig_alg->algorithm);
if (sig_nid == NID_undef) {
    fprintf(stderr, "unable to find specified signature algorithm name.\n");
    return EXIT_FAILURE;
}
This can be translated into a string representation (either short name or long description):
char sigalgo_name[SIG_ALGO_LEN+1];
const char* sslbuf = OBJ_nid2ln(sig_nid);
if (strlen(sslbuf) > SIG_ALGO_LEN) {
    fprintf(stderr, "signature algorithm name longer than allocated buffer.\n");
    return EXIT_FAILURE;
}
strncpy(sigalgo_name, sslbuf, SIG_ALGO_LEN+1);
This will result in a string such as sha1WithRSAEncryption or md5WithRSAEncryption.
Public Key
Parsing the public key on a certificate is type-specific. Here, we show how to determine which type of key is included and how to parse RSA and DSA keys:
char pubkey_algoname[PUBKEY_ALGO_LEN];
int pubkey_algonid = OBJ_obj2nid(cert->cert_info->key->algor->algorithm);
if (pubkey_algonid == NID_undef) {
    fprintf(stderr, "unable to find specified public key algorithm name.\n");
    return EXIT_FAILURE;
}
const char* sslbuf = OBJ_nid2ln(pubkey_algonid);
assert(strlen(sslbuf) < PUBKEY_ALGO_LEN);
strncpy(pubkey_algoname, sslbuf, PUBKEY_ALGO_LEN);
if (pubkey_algonid == NID_rsaEncryption || pubkey_algonid == NID_dsa) {
    EVP_PKEY *pkey = X509_get_pubkey(cert);
    IFNULL_FAIL(pkey, "unable to extract public key from certificate");
    RSA *rsa_key;
    DSA *dsa_key;
    char *rsa_e_dec, *rsa_n_hex, *dsa_p_hex,
         *dsa_q_hex, *dsa_g_hex, *dsa_y_hex;
    switch(pubkey_algonid) {
    case NID_rsaEncryption:
        rsa_key = pkey->pkey.rsa;
        IFNULL_FAIL(rsa_key, "unable to extract RSA public key");
        rsa_e_dec = BN_bn2dec(rsa_key->e);
        IFNULL_FAIL(rsa_e_dec, "unable to extract rsa exponent");
        rsa_n_hex = BN_bn2hex(rsa_key->n);
        IFNULL_FAIL(rsa_n_hex, "unable to extract rsa modulus");
        break;
    case NID_dsa:
        dsa_key = pkey->pkey.dsa;
        IFNULL_FAIL(dsa_key, "unable to extract DSA pkey");
        dsa_p_hex = BN_bn2hex(dsa_key->p);
        IFNULL_FAIL(dsa_p_hex, "unable to extract DSA p");
        dsa_q_hex = BN_bn2hex(dsa_key->q);
        IFNULL_FAIL(dsa_q_hex, "unable to extract DSA q");
        dsa_g_hex = BN_bn2hex(dsa_key->g);
        IFNULL_FAIL(dsa_g_hex, "unable to extract DSA g");
        dsa_y_hex = BN_bn2hex(dsa_key->pub_key);
        IFNULL_FAIL(dsa_y_hex, "unable to extract DSA y");
break