I'm not sure working in parallel is always a good decision.
An anecdote: just the other day I had to implement batching instead of concurrent processing because PostgreSQL really hated me having thousands of concurrent transactions on the same table.
My particular workflow was essentially this: I receive a batch (a full state dump) with some products, and I need to update my `products` table to keep track of them (soft-deleting what has disappeared, inserting new rows, updating existing ones):
BEGIN;
-- Quickly load the batch into a temporary table
CREATE TEMPORARY TABLE products_tmp (LIKE products INCLUDING ALL) ON COMMIT DROP;
COPY products_tmp FROM STDIN;
-- Soft-delete products missing from the current batch
UPDATE products SET is_active = FALSE WHERE is_active AND store_id = ANY($1) AND id NOT IN (SELECT id FROM products_tmp WHERE store_id = ANY($1));
-- Upsert products from the current batch (add new, update existing)
INSERT INTO products (...) SELECT ... FROM products_tmp ON CONFLICT (id) DO UPDATE SET ...;
COMMIT;
With just a few thousand concurrent writers things started to look quite ugly: constant serialization failures (I started at SERIALIZABLE, then downgraded to REPEATABLE READ, and was reluctant to drop to READ COMMITTED) and deadlocks that prevented me from performing some DDL (schema migrations) on the products table.
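Before giving up on concurrency, the usual first move against serialization failures is to retry the whole transaction with some backoff. A minimal sketch of that pattern, with all names hypothetical and a stand-in exception in place of PostgreSQL's real error 40001 (it doesn't help much once contention is as heavy as it was here, which is what pushed me toward batching):

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for PostgreSQL error 40001 (serialization_failure)."""

def run_with_retry(txn, max_attempts=5, base_delay=0.05):
    """Run a transaction callable, retrying on serialization failure
    with jittered exponential backoff to spread retries apart."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))

# Demo: a "transaction" that fails twice before committing.
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"

print(run_with_retry(flaky_txn))  # committed
```

In a real driver you'd catch the driver's serialization-failure exception and re-run the whole BEGIN...COMMIT block, since PostgreSQL aborts the transaction on that error.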
So I started to batch those batches elsewhere and dump them at periodic intervals, and things improved a lot. Maybe that was a naive/brute-force approach and I should have done some parameter tweaking and/or fancy table partitioning or something else (idk) for congestion control instead, but at least it worked.
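The "batch the batches" idea boils down to an accumulator that collects incoming batches and hands them to a single writer once enough rows pile up or enough time passes. A rough sketch, assuming a hypothetical `flush` callable that would run the one big transaction (all names here are made up for illustration):

```python
import time

class BatchAccumulator:
    """Collect incoming batches and flush them as one write once
    enough rows pile up or the oldest row gets too stale."""

    def __init__(self, flush, max_rows=10_000, max_age_s=5.0):
        self.flush_fn = flush      # e.g. runs the COPY + upsert transaction
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self.rows = []
        self.first_row_at = None

    def add(self, batch):
        if not self.rows:
            self.first_row_at = time.monotonic()
        self.rows.extend(batch)
        if (len(self.rows) >= self.max_rows
                or time.monotonic() - self.first_row_at >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.rows:
            self.flush_fn(self.rows)  # single writer -> one transaction
            self.rows = []
            self.first_row_at = None

# Demo with tiny limits: 12 single-row batches, flushed 5 at a time.
flushed = []
acc = BatchAccumulator(flushed.append, max_rows=5, max_age_s=60)
for i in range(12):
    acc.add([i])
acc.flush()  # drain the tail
print([len(chunk) for chunk in flushed])  # [5, 5, 2]
```

The point is that the database only ever sees one writer doing a few large transactions instead of thousands of small ones fighting over the same rows.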