Hello,

I am trying to index a large number of documents using the PHP client for Elasticsearch. I have written a PHP script that uses RecursiveIteratorIterator to walk through my complex directory tree, collect the paths into an array, and then index them in Elasticsearch.

Here is the code:

<?php

require 'vendor/autoload.php';

$client = new Elasticsearch\Client();

// realpath() does not expand "~", so build the path from $HOME instead
$root = realpath(getenv('HOME') . '/elkdata/for_elk_test_2014_11_24/Agencies');

$iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, RecursiveDirectoryIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST,
        RecursiveIteratorIterator::CATCH_GET_CHILD);

$paths = array($root);
foreach ($iter as $path => $dir) {
    if ($dir->isDir()) {
        $paths[] = $path;
    }
}

//Create the index and mappings
$mapping['index'] = 'rvuehistoricaldocuments2009-2013'; // index name
$mapping['body'] = array (
    'mappings' => array (
        'documents' => array (
            '_source' => array (
                'enabled' => true
            ),
            'properties' => array(
                'doc_name' => array(
                    'type' => 'string',
                    'analyzer' => 'standard'
                ),
                'description' => array(
                    'type' => 'string'
                )
            )
        )
    )
);

$client->indices()->create($mapping);

//Now index the documents

$params = array();

for ($i = 0; $i < count($paths); $i++) {
    // Bulk format: action metadata entry first, then the document source as a separate entry
    $params['body'][] = array(
        'index' => array(
            '_index' => 'rvuehistoricaldocuments2009-2013',
            '_type'  => 'documents'
        )
    );

    $params['body'][] = array(
        'foo' => 'bar' // Document body goes here
    );

    // Every 1000 documents stop and send the bulk request.
    if (($i + 1) % 1000 == 0) {
        $responses = $client->bulk($params);

        // erase the old bulk request
        $params = array();

        // unset the bulk response when you are done to save memory
        unset($responses);
    }
}
?>
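One thing I am not sure about: with the modulo check above, any documents left over after the last full batch of 1000 never get sent. I think I need a final bulk call once the loop ends, something like this (assuming $params still holds the unsent body at that point):

if (!empty($params['body'])) {
    // send whatever is left in the last partial batch
    $responses = $client->bulk($params);
    unset($responses);
}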

I just want to know if this looks right. Thanks