Sql – Oracle bug? SELECT returns no dupes, INSERT from SELECT has duplicate rows

oracleoracle11gplsqlsql

I'm getting some strange behaviour from an Oracle instance I'm working on. This is 11gR1 on Itanium, no RAC, nothing fancy. Ultimately I'm moving data from one Oracle instance to another in a data warehouse scenario.

I have a semi-complex view running over a DB link; 4 inner joins over large-ish tables and 5 left joins over mid-size tables.

Here's the problem: when I test the view in SQL Developer (or SQL*Plus) it seems fine, no duplication whatsoever. However, when I actually use the view to insert data into a table I get a large number of dupes.

EDIT: – The data is going into an empty table. All of the tables in the query are on the database link. The only thing passed into the query is a date (e.g. INSERT INTO target SELECT * FROM view WHERE view.datecol = dQueryDate) –

I've tried adding a ROW_NUMBER() function to the select statement, partitioned by the PK for the view. All rows come back numbered as 1. Again though, the same statement run as an insert generates the same dupes as before and now conveniently numbered. The number of duped rows is not the same per key. Some records exist 4 times some only exist once.

I find this to behaviour to be extremely perplexing. 🙂 It reminds me of working with Teradata where you have SET tables (unique rows only) and MULTISET tables (duplicate rows allowed) but Oracle has no such functionality.

A select that returns rows to the client should behave identically to one that inserts those rows to another location. I can't imagine a legitimate reason for this to happen, but maybe I'm suffering from a failure of imagination. 😉

I wonder if anyone else has experienced this or if it's a bug on this platform.

SOLUTION

Thanks to @Gary, I was able to get to the bottom of this by using "EXPLAIN PLAN FOR {my query};" and "SELECT * FROM TABLE(dbms_xplan.display);". The explain that actually gets used for the INSERT is very different from the SELECT.

For the SELECT most of the plan operations are 'TABLE ACCESS BY INDEX ROWID' and 'INDEX UNIQUE SCAN'. The 'Predicate Information' block contains all of the joins and filters from the query. At the end it says "Note – fully remote statement".

For the INSERT there is no reference to the indexes. The 'Predicate Information' block is just three lines and a new 'Remote SQL' block shows 9 small SQL statements.

The database has split my query into 9 subqueries and then attempts to join them locally. By running the smaller selects I've located the source of the duplicates.

I believe this is bug in the Oracle compiler around remote links. It creates logical flaws when re-writing the SQL. Basically the compiler is not properly applying the WHERE clause. I was just testing it and gave it an IN list of 5 keys to bring back. SELECT brings back 5 rows. INSERT puts 77,000+ rows into the target and totally ignores the IN list.

{Still looking for a way to force the correct behaviour, I may have to ask for the view to be created on the remote database although that is not ideal from a development viewpoint. I'll edit this when I've got it working…}

Best Answer

It seems to be Oracle Bug, we have found this following workarround: If you want that your "insert into select ..." work like your "select ...", you can pack your select in a sub select.

For example :

select x,y,z from table1, table2, where ...

--> no duplicate

insert into example_table
select x,y,z from table1, table2, where ...

--> duplicate error

insert into example_table
select * from (
       select x,y,z from table1, table2, where ...
)

--> no duplicate

Regards

Related Solutions

Sql – Insert into … values ( SELECT … FROM … )

Try:

INSERT INTO table1 ( column1 )
SELECT  col1
FROM    table2

This is standard ANSI SQL and should work on any DBMS

It definitely works for:

Oracle
MS SQL Server
MySQL
Postgres
SQLite v3
Teradata
DB2
Sybase
Vertica
HSQLDB
H2
AWS RedShift
SAP HANA
Google Spanner

Mysql – concatenate multiple MySQL rows into one field

You can use GROUP_CONCAT:

SELECT person_id,
   GROUP_CONCAT(hobbies SEPARATOR ', ')
FROM peoples_hobbies
GROUP BY person_id;

As Ludwig stated in his comment, you can add the DISTINCT operator to avoid duplicates:

SELECT person_id,
   GROUP_CONCAT(DISTINCT hobbies SEPARATOR ', ')
FROM peoples_hobbies
GROUP BY person_id;

As Jan stated in their comment, you can also sort the values before imploding it using ORDER BY:

SELECT person_id, 
       GROUP_CONCAT(hobbies ORDER BY hobbies ASC SEPARATOR ', ')
FROM peoples_hobbies
GROUP BY person_id;

As Dag stated in his comment, there is a 1024 byte limit on the result. To solve this, run this query before your query:

SET group_concat_max_len = 2048;

Of course, you can change 2048 according to your needs. To calculate and assign the value:

SET group_concat_max_len = CAST(
                     (SELECT SUM(LENGTH(hobbies)) + COUNT(*) * LENGTH(', ')
                           FROM peoples_hobbies
                           GROUP BY person_id) AS UNSIGNED);

Best Answer

Related Solutions

Sql – Insert into … values ( SELECT … FROM … )

Mysql – concatenate multiple MySQL rows into one field

Related Topic