Using Action Scheduler to process large data in multiple small batches

Action Scheduler is a handy tool for scheduling tasks to run in the future. You can use it to break big jobs, such as updating the prices of thousands of products or changing the status of many orders, into smaller pieces. In this tutorial, we are going to increase the price of every product on our website by 10%.

Basics:

Think of as_schedule_single_action as a deferred version of the do_action function. While do_action runs the attached callbacks immediately, as_schedule_single_action queues the hook so that it fires at a specific future time.

as_schedule_single_action( $timestamp, $hook, $args, $group, $unique, $priority );
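As a rough illustration of how the first four parameters fit together, the sketch below queues a made-up hook, pj_example_hook, one minute from now. The hook name, callback, helper function, group name, and argument values are all hypothetical. The one behaviour it relies on is that each element of $args is passed to the callback as a separate parameter, which is why the callback is registered with an accepted-argument count of 2.

```php
<?php
/**
 * Hypothetical callback: receives each element of $args as its own parameter.
 */
function pj_example_callback( $product_id, $percent ) {
	error_log( sprintf( 'Product %d would be increased by %d%%.', $product_id, $percent ) );
}

/**
 * Hypothetical helper that queues the action. Call it once, not on every request.
 */
function pj_schedule_example() {
	if ( ! function_exists( 'as_schedule_single_action' ) ) {
		return; // Action Scheduler is not loaded.
	}

	as_schedule_single_action(
		time() + 60,        // $timestamp: earliest time the hook may fire.
		'pj_example_hook',  // $hook: the action to fire.
		array( 123, 10 ),   // $args: passed to the callback in this order.
		'pj-examples'       // $group: a label for grouping related actions.
	);
}

// Guarded so the file can also be loaded outside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action( 'pj_example_hook', 'pj_example_callback', 10, 2 );
}
```

Calling pj_schedule_example() on or after the init hook is the safer choice, since Action Scheduler expects its data store to be initialized before its API functions are used.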

Let’s create our class

We’ll start by creating a class that holds our logic. Here’s a basic outline of what it might look like:

class PJ_Product_Price_Updater {

	/**
	 * Construct.
	 */
	public function __construct() {
		add_action( 'pj_increase_prices', array( $this, 'run_single_step' ) );
	}

	/**
	 * Call this function to start the data processing. This is the starting point.
	 */
	public function start_process() {
		if ( get_option( 'pj_price_increase_completed' ) ) {
			// Process has completed, don't start again.
			return;
		}

		$this->run_single_step();
	}

	/**
	 * Run step.
	 */
	public function run_single_step() {
		$products = $this->get_products_to_be_processed();

		if ( empty( $products ) ) {
			// There are no products left to process.
			update_option( 'pj_price_increase_completed', true );
		} else {
			$this->increase_prices( $products );
			// The `pj_increase_prices` hook will keep calling this function until
			// all the products have been updated.
			as_schedule_single_action( time() + 1, 'pj_increase_prices' );
		}
	}
}

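The as_schedule_single_action function queues the given hook to fire at the specified time. In this case, we’re asking it to fire the pj_increase_prices hook 1 second from now. Action Scheduler runs due actions the next time its queue is processed, so the actual delay can be longer than 1 second. What matters is that each batch runs in its own request rather than in one long-running request, which helps prevent overloading the server.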
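One thing this leaves open is what happens if the process is started again while a batch is still queued, which would create a second chain of actions running alongside the first. A possible guard, sketched here as a standalone helper, is to check for an existing action before starting. The helper name pj_start_if_idle is hypothetical, and as_has_scheduled_action() requires Action Scheduler 3.3.0 or later.

```php
<?php
/**
 * Hypothetical helper: start the price update only when no batch is queued.
 *
 * @param object $updater An instance of PJ_Product_Price_Updater.
 * @return bool True if the process was started, false otherwise.
 */
function pj_start_if_idle( $updater ) {
	if ( ! function_exists( 'as_has_scheduled_action' ) ) {
		return false; // Action Scheduler is missing or older than 3.3.0.
	}

	// Reports true for an action that is pending or currently in progress.
	if ( as_has_scheduled_action( 'pj_increase_prices' ) ) {
		return false;
	}

	$updater->start_process();
	return true;
}
```

This check belongs at the entry point only. Inside run_single_step() the current action counts as in progress, so the same check there would stop the next batch from ever being queued.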

We haven’t defined the get_products_to_be_processed and increase_prices functions yet; we’ll get to them after looking at how the loop works.

How it works

  1. When you’re ready to start the process, you call start_process(). This function first checks the pj_price_increase_completed option to see if the process is already done. We don’t want to accidentally increase prices twice! If the process isn’t completed, it calls the function run_single_step() where the main action happens.
  2. run_single_step() first fetches the products to process. The get_products_to_be_processed function returns the next 20 products that need to be processed. It doesn’t return duplicate products because when we process a product, we mark it as processed by saving a meta field in the product’s metadata.
  3. If get_products_to_be_processed doesn’t return any products, it means we’ve processed all products. We then mark the process as completed by updating the pj_price_increase_completed option.
  4. If get_products_to_be_processed does return products, we process them (increase their prices) and then schedule the pj_increase_prices hook to fire again after 1 second. When that hook fires, it calls run_single_step again, creating a loop. The loop ends when there are no more products left to process.
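The steps above can be tried out without WordPress. The sketch below is a plain-PHP simulation, not the tutorial’s class: products are array entries, the processed flag stands in for the pj_price_updated meta field, and a while loop stands in for Action Scheduler firing the hook again. All function names are hypothetical.

```php
<?php
/**
 * Return up to $limit items that have not been processed yet.
 */
function pj_sim_next_batch( array $items, $limit ) {
	$pending = array_filter(
		$items,
		function ( $item ) {
			return empty( $item['processed'] );
		}
	);

	// Preserve keys so the caller can write results back to $items.
	return array_slice( $pending, 0, $limit, true );
}

/**
 * Process every item in batches and return the number of batches that did work.
 */
function pj_sim_run( array &$items, $limit ) {
	$batches = 0;

	while ( true ) {
		$batch = pj_sim_next_batch( $items, $limit );

		if ( empty( $batch ) ) {
			break; // Nothing left: this is where the "completed" option would be set.
		}

		foreach ( array_keys( $batch ) as $key ) {
			$items[ $key ]['price']    *= 1.1;
			$items[ $key ]['processed'] = true; // Stands in for the meta flag.
		}

		$batches++;
	}

	return $batches;
}

// 45 fake products processed 20 at a time take three batches (20 + 20 + 5).
$items = array_fill( 0, 45, array( 'price' => 10.0, 'processed' => false ) );
echo pj_sim_run( $items, 20 ), "\n"; // 3
```

Because every item is flagged as soon as it is handled, a later batch never picks it up again, which is the same property the meta field gives the real class.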

Here’s how the increase_prices function might look:

	/**
	 * Increase prices.
	 *
	 * @param array $products Array of Products.
	 */
	public function increase_prices( $products ) {
		foreach ( $products as $product ) {
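			$regular_price = $product->get_regular_price();

			// Leave the price alone for products without a regular price
			// (for example, variable product parents) so it is not set to 0.
			if ( '' !== $regular_price ) {
				$product->set_regular_price( floatval( $regular_price ) * 1.1 );
			}

			// Update this meta so we know this product has been processed.
			$product->update_meta_data( 'pj_price_updated', true );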
			$product->save();
		}
	}
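Multiplying by 1.1 can produce more decimal places than a shop displays; 19.99 becomes 21.989, for instance. Whether and how to round is a business decision this tutorial does not cover, so treat the helper below as one possible approach. The function name pj_increased_price and the default of two decimals are assumptions. Inside WooCommerce, the store’s configured precision from wc_get_price_decimals() could be passed in instead.

```php
<?php
/**
 * Hypothetical helper: raise a price by a percentage and round the result.
 *
 * @param string|float|null $price    Current price; WooCommerce getters return strings.
 * @param float             $percent  Percentage to add, e.g. 10 for 10%.
 * @param int               $decimals Number of decimal places to keep.
 * @return float|null Rounded price, or null when no price is set.
 */
function pj_increased_price( $price, $percent = 10.0, $decimals = 2 ) {
	if ( '' === $price || null === $price ) {
		return null; // Products without a regular price are left untouched.
	}

	return round( (float) $price * ( 1 + $percent / 100 ), $decimals );
}

var_dump( pj_increased_price( '19.99' ) ); // float(21.99)
var_dump( pj_increased_price( '' ) );      // NULL
```

In increase_prices, the returned value could be passed to set_regular_price() whenever it is not null.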

After increasing the price of a product, we set its pj_price_updated meta key to mark it as processed. The get_products_to_be_processed function relies on this key when fetching the next batch: its meta query asks only for products where pj_price_updated does not exist.

	/**
	 * Get products which are not yet updated.
	 */
	public function get_products_to_be_processed() {
		$args = array(
			'post_type'      => 'product',
			'posts_per_page' => 20,
			'meta_query'     => array(
				array(
					'key'     => 'pj_price_updated',
					'compare' => 'NOT EXISTS',
				),
			),
		);

		$posts = get_posts( $args );

		if ( empty( $posts ) ) {
			return false;
		}

		$products = array();
		foreach ( $posts as $post ) {
			$products[] = wc_get_product( $post->ID );
		}

		return $products;
	}

Putting it all together

Here’s the complete class with all the functions defined:

class PJ_Product_Price_Updater {

	/**
	 * Construct.
	 */
	public function __construct() {
		add_action( 'pj_increase_prices', array( $this, 'run_single_step' ) );
	}

	/**
	 * Call this function to start the data processing. This is the starting point.
	 */
	public function start_process() {
		if ( get_option( 'pj_price_increase_completed' ) ) {
			// Process has completed, don't start again.
			return;
		}

		$this->run_single_step();
	}

	/**
	 * Run step.
	 */
	public function run_single_step() {
		$products = $this->get_products_to_be_processed();

		if ( empty( $products ) ) {
			// There are no products left to process.
			update_option( 'pj_price_increase_completed', true );
		} else {
			$this->increase_prices( $products );
			// The `pj_increase_prices` hook will keep calling this function until
			// all the products have been updated.
			as_schedule_single_action( time() + 1, 'pj_increase_prices' );
		}
	}


	/**
	 * Get products which are not yet updated.
	 */
	public function get_products_to_be_processed() {
		$args = array(
			'post_type'      => 'product',
			'posts_per_page' => 20,
			'meta_query'     => array(
				array(
					'key'     => 'pj_price_updated',
					'compare' => 'NOT EXISTS',
				),
			),
		);

		$posts = get_posts( $args );

		if ( empty( $posts ) ) {
			return false;
		}

		$products = array();
		foreach ( $posts as $post ) {
			$products[] = wc_get_product( $post->ID );
		}

		return $products;
	}

	/**
	 * Increase prices.
	 *
	 * @param array $products Array of Products.
	 */
	public function increase_prices( $products ) {
		foreach ( $products as $product ) {
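			$regular_price = $product->get_regular_price();

			// Leave the price alone for products without a regular price
			// (for example, variable product parents) so it is not set to 0.
			if ( '' !== $regular_price ) {
				$product->set_regular_price( floatval( $regular_price ) * 1.1 );
			}

			// Update this meta so we know this product has been processed.
			$product->update_meta_data( 'pj_price_updated', true );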
			$product->save();
		}
	}
}


add_action(
	'init',
	function() {
		$updater = new PJ_Product_Price_Updater();
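		// Only users who can manage the shop may trigger the update from the URL.
		if ( isset( $_GET['start_price_updation'] ) && current_user_can( 'manage_woocommerce' ) ) {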
			$updater->start_process();
			exit;
		}
	}
);

By using Action Scheduler, we can manage large tasks without overwhelming our server. It allows us to break down a huge task into smaller, manageable chunks and process them one by one. In our example, we used it to update the prices of thousands of products in batches of 20 instead of in a single long-running request. This approach can be applied to any large task that needs to be processed in smaller parts.
