I’ve been working on a rather large web application that combines data from a variety of sources and presents it to the end user in a clean, unified fashion. In the process we sometimes run into cases where multiple related calls are made, each performing some transformative work on a single set of data. We decided these calls could be made in parallel, and so started looking into ways of parallelizing PHP so that relatively expensive operations could run at the same time and their results be combined at the end.

We examined a few possible solutions such as Gearman, popen, and the curl_multi functions. However, all of these methods seemed to require more overhead than they were worth. What I really wanted was something along the lines of POSIX threads to distribute the workload, with shared memory for passing data between the parent and its children.

After some searching through PHP extensions and the official documentation I ran across PHP’s Process Control extension (PCNTL), whose functions include pcntl_fork. Note that pcntl_fork creates a child process rather than a thread. Combined with PHP’s Shared Memory Functions (shmop), this promises to fit the bill of inexpensive distribution of processing tasks along with low-overhead inter-process communication.
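As a rough sketch of the fork semantics involved (not the proof-of-concept itself): pcntl_fork returns -1 on failure, 0 inside the child, and the child's PID inside the parent. The exit code 7 and the variable names here are arbitrary choices for illustration.

```php
<?php
// Minimal sketch: fork one child and collect its exit code in the parent.
$pid = pcntl_fork();

if ($pid === -1) {
    // Fork failed; no child was created
    fwrite(STDERR, "Could not fork" . PHP_EOL);
    exit(1);
} elseif ($pid === 0) {
    // Child process: do some work, then exit with an arbitrary code
    exit(7);
}

// Parent process: block until this specific child exits
pcntl_waitpid($pid, $status);
$exitCode = pcntl_wifexited($status) ? pcntl_wexitstatus($status) : null;

echo "Child " . $pid . " exited with code " . var_export($exitCode, true) . PHP_EOL;
```

The exit code only carries a small integer, which is why some other channel, such as shared memory, is needed to pass real data back.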

Here is a sample proof-of-concept script. I’ll outline what it does below:

<?php

$data = array();

echo "Parent PID: ".getmypid().PHP_EOL;

function forkTest(array &$data) {
	$pids = array();

	$parent_pid = getmypid();

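	for($i = 0; $i < 10; $i++) {
		if(getmypid() == $parent_pid) {
			$pid = pcntl_fork();
			if($pid == -1) {
				/* Fork failed, so there is no child to track */
				echo "Could not fork child".PHP_EOL;
			} else if($pid > 0) {
				/* Only the parent records child PIDs; the child sees 0 here */
				$pids[] = $pid;
				echo "Forking child, \$pids now has ".count($pids)." elements".PHP_EOL;
			}
		}
	}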
	
	if (getmypid() == $parent_pid) {
		/* Parent process */
		echo "Hello from parent: ".getmypid().PHP_EOL;
		array_push($data, "parent".getmypid());
		
		/* Process childrens' results as they exit */
		while(count($pids) > 0) {
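			$pid = pcntl_waitpid(-1, $status);
			if($pid <= 0) break; /* waitpid failed or no children remain */

			echo "Attempting to open memory with pid: ".$pid.PHP_EOL;
			$shm_id = shmop_open($pid, "a", 0, 0);

			if($shm_id) {
				$shm_data = unserialize(shmop_read($shm_id, 0, shmop_size($shm_id)));
				/* Mark the segment for removal so it does not outlive the script */
				shmop_delete($shm_id);
				/* shmop_close() is deprecated as of PHP 8.0, where it does nothing */
				shmop_close($shm_id);

				if(is_array($shm_data)) $data = array_merge($data, $shm_data);
			}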

			/* Hunt down and remove pid entry */
			foreach($pids as $key => $tpid) {
				if($pid == $tpid) unset($pids[$key]);
			}
		}

		echo "All children exited, \$data now has:".count($data)." elements".PHP_EOL;
		$pids = array();
	} else {
		/* Child processes */
		$pdata = array();
		echo "Hello from child: ".getmypid().PHP_EOL;
		array_push($pdata, "child".getmypid());
		$data_str = serialize($pdata);

		$shm_id = shmop_open(getmypid(), "c", 0644, strlen($data_str));
		if (!$shm_id) {
			echo "Couldn't create shared memory segment".PHP_EOL;
		} else {
			if(shmop_write($shm_id, $data_str, 0) != strlen($data_str)) {
				echo "Couldn't write shared memory data".PHP_EOL;
			}
		}

		sleep(rand(1,10));
		exit(0);
	}
}

/* Run the test 10 times */
for($f = 0; $f < 10; $f++) {
	echo "Running $f forkTest()".PHP_EOL;
	forkTest($data);
}

echo "Fork test finished, \$data now contains ".count($data)." elements".PHP_EOL;
echo "\$data:".PHP_EOL.json_encode($data);

This code defines a function that forks 10 child worker processes. Each child starts with its own copy of the parent’s memory, so anything it wrote to $data would never reach the parent. Instead, each child pushes a string containing its process ID into a local array, serializes it, and places it into a shared memory segment that uses the process ID as its key. The parent process waits for each child to exit, reads the data from shared memory, deletes the segment, and then merges the results into the master $data array. My test application runs through this function 10 times to demonstrate how forking in PHP can be safe and memory efficient. Each run adds 11 elements (one from the parent and one from each of the 10 children), so the result should be a $data array with 110 elements in it. I’ve thrown in sleep calls with a random duration between 1 and 10 seconds to show how children can finish at different times.
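To make the shared memory hand-off easier to see in isolation, here is a minimal single-process sketch of the same write-then-read round trip. The helper names shmPut and shmTake are hypothetical and not part of the script above; the key is simply the current PID, mirroring what the children do.

```php
<?php
// Hypothetical helpers: store and fetch a serialized array under an integer key.

function shmPut(int $key, array $payload): bool {
    $bytes = serialize($payload);
    // "c" creates the segment (or opens an existing one) with the given size
    $shm = @shmop_open($key, "c", 0644, strlen($bytes));
    if ($shm === false) {
        return false;
    }
    return shmop_write($shm, $bytes, 0) === strlen($bytes);
}

function shmTake(int $key): ?array {
    // "a" opens an existing segment read-only; permissions and size must be 0
    $shm = @shmop_open($key, "a", 0, 0);
    if ($shm === false) {
        return null;
    }
    $bytes = shmop_read($shm, 0, shmop_size($shm));
    shmop_delete($shm); // mark the segment for removal once it is detached
    $payload = unserialize($bytes);
    return is_array($payload) ? $payload : null;
}

$key = getmypid();
$ok = shmPut($key, array("child" . $key));
$roundTrip = $ok ? shmTake($key) : null;

echo json_encode($roundTrip) . PHP_EOL;
```

In the forking script the write happens in the child and the read in the parent, but the calls and their order are the same.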

No doubt optimizations can be made, but this should serve as at least a rudimentary example of real, efficient parallel processing in PHP (strictly speaking, with forked processes rather than threads). That holds provided that the work you are planning on doing is worth the overhead (which, small as it may be, still exists and should be factored in) and provided that you do not mind locking your application down to a POSIX environment: the PCNTL functions are not available on Windows, so the above code will not work there.
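If the POSIX requirement is a concern, one option is to check for the needed functions up front and fall back to sequential processing. This is only a sketch: canFork is a hypothetical helper name, and PHP_OS_FAMILY assumes PHP 7.2 or later.

```php
<?php
// Hypothetical guard: report whether forking with shared memory is usable here.
function canFork(): bool {
    return PHP_OS_FAMILY !== 'Windows'
        && function_exists('pcntl_fork')
        && function_exists('shmop_open');
}

if (canFork()) {
    echo "Forking is available; work can be spread across child processes." . PHP_EOL;
} else {
    echo "pcntl or shmop is missing; fall back to running the work sequentially." . PHP_EOL;
}
```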
