Skip to content

BulkInsert

Alexey Lapin edited this page Jan 22, 2023 · 3 revisions

Public API

You can see the public API in the BulkInsertContract

Example

use Lapaliv\BulkUpsert\BulkInsert;
use Illuminate\Database\Eloquent\Collection;
use App\Models\User;

$users = new Collection([ /* ... */ ]);

app()->make(BulkInsert::class)
    ->chunk(100, function(Collection $collection): mixed {
        // You can change the collection here and return it if you want
        // or just read it
        // The callback can return 
        // - Collection<BulkModel> - the original collection will be replaced with new one
        // - void|null - the original collection won't be changed
    })

    // `setEvents` takes the names of the eloquent's events.
    // Only transmitted events will be fired.
    // Default is ['creating', 'created', 'saving', 'saved']
    ->setEvents([ /* ... */ ])

    // It's an alias for `setEvents([])`
    ->disableEvents()

    // You can provide column names which will select from the database
    // Default is ['*']
    ->select(['*'])

    ->onCreating(
        function(Collection $collection): mixed {
            // You can manage the collection before inserting
            // The callback can return:
            // - Collection<BulkModel> - the original collection will be replaced with new one
            // - void|null - the original collection won't be changed
        }
    )

    ->onCreated(
        function(Collection $collection): void {
            // You can get all the inserted rows in this callback
            // The collection won't contain ignored rows. If you need them please use `onSaved`
        }
    )

    ->onSaved(
        function(Collection $collection): void {
            // You will get all your inserted and ignored models here.
            // You can separate them:
            // $inserted = $collection->filter(
            //     fn(User $user) => $user->wasRecentlyCreated
            // );
            // $ignored = $collection->filter(
            //     fn(User $user) => $user->wasRecentlyCreated === false
            // );
        }
    )

    // The first params is name of the model. Use can provide it like string (User::class) or an object (new User()).
    // The second param is an array of model columns to identify the model in the collection. 
    // These columns will be used to search rows in the database.
    // The third param is the collection of rows. It could have any `iterable` type such as `array`, `Collection`, `Iterator`, etc.
    // The fourth param is bool. Set it to true if you want to ignore any conflicts.
    // The method throws an exception if you have conflicts in your database
    ->insert(User::class, ['email'], $users, false);

    // You can also use `insertOrIgnore` method. It won't throw any exceptions if you have conflicts
    // ->insertOrIgnore(User::class, ['email'], $users);

Benchmarks

Environment

PHP: 8.1.14

MySQL: 8.0.28

Without callbacks (equals without select)

Task

100 models + 100 times + chunk(100) + insert only (without select)

Code

You can see the feature which generates entities here

/** @var GenerateEntityCollectionTestFeature $feature */
$feature = $this->app->make(GenerateEntityCollectionTestFeature::class);
/** @var BulkInsert $bulkInsert */
$bulkInsert = $this->app
    ->make(BulkInsert::class)
    ->chunk(100);
//            ->onSaved(
//                fn(Collection $collection) => null
//            );

$times = [];
$numberOfQueries = [];
for ($i = 0; $i < 100; $i++) {
    $entities = $feature->handle(MySqlEntityWithAutoIncrement::class, 100);
    
    DB::connection('mysql')->enableQueryLog();
    $start = microtime(true);
    
    $bulkInsert->insertOrIgnore(MySqlEntityWithAutoIncrement::class, ['uuid'], $entities);
    
    $times[] = microtime(true) - $start;
    $numberOfQueries[] = count(DB::connection('mysql')->getQueryLog());
    DB::connection('mysql')->flushQueryLog();
}

echo 'time:' . PHP_EOL;
echo 'median = ' . round(collect($times)->median(), 4) . PHP_EOL;
echo 'avg = ' . round(collect($times)->avg(), 4) . PHP_EOL;
echo 'min = ' . round(collect($times)->min(), 4) . PHP_EOL;
echo 'max = ' . round(collect($times)->max(), 4) . PHP_EOL;

echo 'number of queries:' . PHP_EOL;
echo 'median = ' . round(collect($numberOfQueries)->median(), 4) . PHP_EOL;
echo 'avg = ' . round(collect($numberOfQueries)->avg(), 4) . PHP_EOL;
echo 'min = ' . round(collect($numberOfQueries)->min(), 4) . PHP_EOL;
echo 'max = ' . round(collect($numberOfQueries)->max(), 4) . PHP_EOL;

Output

median = 0.0239
avg = 0.0251
min = 0.0222
max = 0.0639
number of queries:
median = 1
avg = 1
min = 1
max = 1

With callback (equals with select)

Task

100 models + 100 times + chunk(100) + select

Code

/** @var GenerateEntityCollectionTestFeature $feature */
$feature = $this->app->make(GenerateEntityCollectionTestFeature::class);
/** @var BulkInsert $bulkInsert */
$bulkInsert = $this->app
    ->make(BulkInsert::class)
    ->chunk(100)
    ->onSaved(
        fn(Collection $collection) => null
    );

$times = [];
$numberOfQueries = [];
for ($i = 0; $i < 100; $i++) {
    $entities = $feature->handle(MySqlEntityWithAutoIncrement::class, 100);

    DB::connection('mysql')->enableQueryLog();
    $start = microtime(true);

    $bulkInsert->insertOrIgnore(MySqlEntityWithAutoIncrement::class, ['uuid'], $entities);

    $times[] = microtime(true) - $start;
    $numberOfQueries[] = count(DB::connection('mysql')->getQueryLog());
    DB::connection('mysql')->flushQueryLog();
}

echo 'time:' . PHP_EOL;
echo 'median = ' . round(collect($times)->median(), 4) . PHP_EOL;
echo 'avg = ' . round(collect($times)->avg(), 4) . PHP_EOL;
echo 'min = ' . round(collect($times)->min(), 4) . PHP_EOL;
echo 'max = ' . round(collect($times)->max(), 4) . PHP_EOL;

echo 'number of queries:' . PHP_EOL;
echo 'median = ' . round(collect($numberOfQueries)->median(), 4) . PHP_EOL;
echo 'avg = ' . round(collect($numberOfQueries)->avg(), 4) . PHP_EOL;
echo 'min = ' . round(collect($numberOfQueries)->min(), 4) . PHP_EOL;
echo 'max = ' . round(collect($numberOfQueries)->max(), 4) . PHP_EOL;

Output

time:
median = 0.0348
avg = 0.0357
min = 0.0317
max = 0.068
number of queries:
median = 3
avg = 3
min = 3
max = 3