Using PEAR::mimeDecode to extract the headers of a bounce email

Writing home-grown email header parsing functions involves tedious work and a lot of regex, as the headers of a bounce email can be encoded or written in mixed upper/lower case. To save ourselves that work, we planned to include the Mail_mimeDecode class, a pretty straightforward approach. Header extraction is quite easy with it, and it can be installed via Composer too.
* Prepare your composer.json

{
    "require":
    {
        "pear/mail_mime-decode": "1.5.5",
        "pear/pear_exception": "1.0.x-dev"
    }
}

* Now, in the mail decode class, add the following to initialize the mimeDecode object

<?php
	// $email is assumed to hold the raw text of the bounce (headers and body)

	// mimeDecode configurations
	$params = array();
	$params['include_bodies'] = true;
	$params['decode_bodies'] = true;
	$params['decode_headers'] = true;

	$decoder = new Mail_mimeDecode( $email );
	$structure = $decoder->decode( $params );

	$emailHeaders = $structure->headers;
	
	$to = $emailHeaders[ 'to' ];
	$subject = $emailHeaders[ 'subject' ];
	$emailDate = $emailHeaders[ 'date' ];
	$permanentFailure = $emailHeaders[ 'x-failed-recipients' ];
?>

Now you have the To, Subject, Date and X-Failed-Recipients headers ready. Please note that to fetch a header, you just use its lower-cased name as the 'key' in the $emailHeaders array.
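A bounce may not carry every header you ask for (X-Failed-Recipients, for example, may be missing), and indexing a missing key raises a notice or warning in PHP. Below is a minimal sketch of a lookup that tolerates this. The helper name getHeaderValue and the sample array are assumptions made for illustration; they are not part of Mail_mimeDecode, and the array only stands in for $structure->headers.

```php
<?php
// Hypothetical helper (not part of Mail_mimeDecode): look up a header by
// name in any letter case and fall back to a default when it is absent.
function getHeaderValue( array $headers, $name, $default = null ) {
	$key = strtolower( $name );
	return isset( $headers[ $key ] ) ? $headers[ $key ] : $default;
}

// Made-up sample data standing in for $structure->headers.
$emailHeaders = array(
	'to' => 'wiki@example.org',
	'subject' => 'Mail delivery failed: returning message to sender',
	'date' => 'Tue, 17 Jun 2014 05:53:13 GMT',
);

// The caller may use any case; the key is lower-cased before the lookup.
$to = getHeaderValue( $emailHeaders, 'To' );

// This header is absent from the sample, so the default (null) comes back
// instead of an "undefined index" notice.
$permanentFailure = getHeaderValue( $emailHeaders, 'X-Failed-Recipients' );
```

With a helper like this, the calling code can test $permanentFailure against null before treating the bounce as a permanent failure.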
Yay! Happy Hacking

Writing a Job queue to deal with load when POST-ing from exim to MediaWiki API

Yesterday, Tim Landscheidt from Wikimedia commented on my earlier post that I should use a job queue to handle the load on the bounce handling API. I talked with Legoktm about this, and he said it was a great idea, as multiple email bounces may reach the API simultaneously. I will jot down how we made that happen.
* Firstly, register and load the Job handler class

//Register and Load Jobs
$wgAutoloadClasses['BounceHandlerJob'] = $dir. '/includes/job/BounceHandlerJob.php';
$wgJobClasses['BounceHandlerJob'] = 'BounceHandlerJob';

* Now, create the file BounceHandlerJob.php, with the class BounceHandlerJob extending Job.
I wanted to get the $email, which will be passed from ApiBounceHandler::execute();

class BounceHandlerJob extends Job {
	public function __construct( Title $title, array $params ) {
		parent::__construct( 'BounceHandlerJob', $title, $params );
	}

	/**
	 * Process the bounce email carried by this job
	 * @return bool
	 */
	public function run() {
		$email = $this->params[ 'email' ];

		if ( $email ) {
			// The function in the API where the header 
			// stripping and other stuff happen
			ApiBounceHandler::processEmail( $email );
		}

		return true;
	}
}

* Now, we need to make the ApiBounceHandler class receive the POST request and push a job onto the job queue, so that the email is processed later instead of during the request.

$email = $this->getMain()->getVal( 'email' );
$params = array ( 'email' => $email );
$title = Title::newFromText( 'BounceHandler Job' );
$job = new BounceHandlerJob( $title, $params );
JobQueueGroup::singleton()->push( $job );
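To see the "enqueue now, process later" split in isolation, here is a stand-alone sketch. It is not MediaWiki's JobQueue: the class SketchQueue and its methods are hypothetical and keep everything in memory, only to show that the request handler does a cheap push while a worker does the real work afterwards.

```php
<?php
// Stand-alone illustration only; this is NOT MediaWiki's JobQueue API.
class SketchQueue {
	/** @var SplQueue Pending job parameters */
	private $jobs;

	public function __construct() {
		$this->jobs = new SplQueue();
	}

	// Called by the request handler: cheap, it only stores the payload.
	public function push( array $params ) {
		$this->jobs->enqueue( $params );
	}

	// Number of jobs still waiting to be run.
	public function pending() {
		return count( $this->jobs );
	}

	// Called later by a worker: runs $callback once per queued job and
	// returns how many jobs were handled.
	public function runAll( $callback ) {
		$handled = 0;
		while ( !$this->jobs->isEmpty() ) {
			call_user_func( $callback, $this->jobs->dequeue() );
			$handled++;
		}
		return $handled;
	}
}

$queue = new SketchQueue();

// Two bounces arrive almost at the same time; both are only queued.
$queue->push( array( 'email' => "To: a@example.org\n\nbody" ) );
$queue->push( array( 'email' => "To: b@example.org\n\nbody" ) );
$pendingBefore = $queue->pending();

// Later, the worker processes them one by one.
$sizes = array();
$handled = $queue->runAll( function ( array $params ) use ( &$sizes ) {
	$sizes[] = strlen( $params['email'] );
} );
$pendingAfter = $queue->pending();
```

In the real extension the push is done through JobQueueGroup and the worker side is BounceHandlerJob::run(), as shown in the snippets of this post.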

Yay! Done. Now the public static processEmail function needs to be defined with the necessary actions, and you are good to go.
PS: Please add the following to LocalSettings.php, so that jobs are not run as part of web requests and you can see the results by running them by hand.

$wgJobRunRate = 0;

To ensure things are working well, run the script

php runJobs.php

Correct errors if any; a clean run will output something like:

2014-07-13 15:02:38 BounceHandlerJob BounceHandler_Job email=string(1862) STARTING
2014-07-13 15:02:38 BounceHandlerJob BounceHandler_Job email=string(1862) t=100 good

Thanks

Parsing email for relevant headers

Often people love to show off their regex skills, but that causes a lot of confusion for people like me who are trying to understand what is being grepped out.
Problem :
* Extract the To, Subject and Date from an email, received by POST or whatever.
Solution :

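/**
  * Extract the required headers from the received email
  *
  * @param string $email Raw email text
  * @return array Header values keyed by 'to', 'subject' and 'date'
  */
protected function getHeaders( $email ) {
	// Start with an empty array so that something defined is always returned
	$headers = array();
	$emailLines = explode( "\n", $email );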
	foreach ( $emailLines as $emailLine ) {
		if ( preg_match( "/^To: (.*)/", $emailLine, $toMatch ) ) {
			$headers[ 'to' ] = $toMatch[1];
		}
		if ( preg_match( "/^Subject: (.*)/", $emailLine, $subjectMatch ) ) {
			$headers[ 'subject' ] = $subjectMatch[1];
		}
		if ( preg_match( "/^Date: (.*)/", $emailLine, $dateMatch ) ) {
			$headers[ 'date' ] = $dateMatch[1];
		}
		if ( trim( $emailLine ) == "" ) {
			// Empty line denotes that the header part is finished
			break;
		}
	}
	return $headers;
}

Now the headers can be used in the calling function like this:

$emailHeaders = $this->getHeaders( $email );
$to = $emailHeaders[ 'to' ];
$subject = $emailHeaders[ 'subject' ];
$emailDate = $emailHeaders[ 'date' ];
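One thing a line-by-line match does not cover is folded headers: a long header value may continue on the next line, which then starts with a space or a tab. The sketch below is one possible way to handle that, along with header names in any letter case and \r\n line endings. The function name parseHeaderBlock and the sample message are made up for illustration; this is not the code used in the extension.

```php
<?php
// Hypothetical helper: parse the header block of a raw message into a
// lower-cased name => value array, joining folded continuation lines.
// If a header occurs more than once, the last occurrence wins.
function parseHeaderBlock( $email ) {
	// Normalise line endings, then keep only the part before the first
	// blank line, which marks the end of the headers.
	$email = str_replace( "\r\n", "\n", $email );
	$parts = explode( "\n\n", $email, 2 );

	// Unfold: a newline followed by spaces or tabs continues the
	// previous header line.
	$headerBlock = preg_replace( "/\n[ \t]+/", ' ', $parts[0] );

	$headers = array();
	foreach ( explode( "\n", $headerBlock ) as $line ) {
		if ( preg_match( '/^([^:\s]+):\s*(.*)$/', $line, $match ) ) {
			$headers[ strtolower( $match[1] ) ] = trim( $match[2] );
		}
	}
	return $headers;
}

// Made-up sample: lower-case "to", a folded Subject, and a body line
// that looks like a header but must be ignored.
$raw = "to: wiki@example.org\r\n"
	. "Subject: Mail delivery failed:\r\n"
	. " returning message to sender\r\n"
	. "Date: Tue, 17 Jun 2014 05:53:13 GMT\r\n"
	. "\r\n"
	. "Subject: this line is in the body\r\n";

$headers = parseHeaderBlock( $raw );
```

Here $headers[ 'subject' ] holds the full unfolded subject, and the Subject-like line in the body is never looked at.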

Used here (https://github.com/wikimedia/mediawiki-extensions-BounceHandler).
PS: Please note that the Date header will be a timestamp in RFC 2822 format (for example, Tue, 17 Jun 2014 05:53:13 GMT). Happy Hacking again!
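If the RFC 2822 date string is needed in another form, PHP's built-in DateTime class can parse it. This is a small sketch using the sample date from the PS above; the variable names are arbitrary, and the 14-digit output is just one format you might choose to store.

```php
<?php
// The Date header value as extracted from the email.
$emailDate = 'Tue, 17 Jun 2014 05:53:13 GMT';

// DateTime understands RFC 2822 dates; the time zone named in the
// string takes precedence over the one passed as the second argument.
$date = new DateTime( $emailDate, new DateTimeZone( 'UTC' ) );
$date->setTimezone( new DateTimeZone( 'UTC' ) );

// Seconds since the Unix epoch.
$unixTimestamp = $date->getTimestamp();

// 14-digit YYYYMMDDHHMMSS form in UTC.
$compact = $date->format( 'YmdHis' );

echo $unixTimestamp, "\n"; // 1402984393
echo $compact, "\n";       // 20140617055313
```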