classification
Title: json module should issue warning about duplicate keys
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Zeturic, andrei.avk, bob.ippolito, corona10
Priority: normal Keywords:

Created on 2021-08-31 00:30 by Zeturic, last changed 2021-11-12 01:06 by andrei.avk.

Messages (4)
msg400678 - Author: Kevin Mills (Zeturic) Date: 2021-08-31 00:30
The json module will allow the following without complaint:

import json
d1 = {1: "fromstring", "1": "fromnumber"}
string = json.dumps(d1)
print(string)
d2 = json.loads(string)
print(d2)

And it prints:

{"1": "fromstring", "1": "fromnumber"}
{'1': 'fromnumber'}

This would be extremely confusing to anyone who doesn't already know that JSON keys have to be strings. Not only does `d1 != d2` (which the documentation does mention as a possibility after a round trip through JSON), but `len(d1) != len(d2)` and `d1['1'] != d2['1']`, even though '1' is in both.

I suggest that if json.dump or json.dumps notices that it is producing a JSON document with duplicate keys, it should issue a warning. Similarly, if json.load or json.loads notices that it is reading a JSON document with duplicate keys, it should also issue a warning.
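For the loading side, something close to the suggested warning can already be approximated with the existing `object_pairs_hook` parameter of `json.loads`. The following is only a minimal sketch of that idea, not a proposed patch; the helper name `warn_on_duplicates` is hypothetical and the warning category defaults to `UserWarning`.

```python
import json
import warnings


def warn_on_duplicates(pairs):
    """Hypothetical object_pairs_hook: build a dict, warning on repeated keys."""
    result = {}
    for key, value in pairs:
        if key in result:
            warnings.warn(f"duplicate key {key!r} in JSON object")
        # Keep the json module's existing behaviour: the last value wins.
        result[key] = value
    return result


d2 = json.loads(
    '{"1": "fromstring", "1": "fromnumber"}',
    object_pairs_hook=warn_on_duplicates,
)
print(d2)  # {'1': 'fromnumber'}, with a UserWarning about the duplicate key
```

The hook is called once per JSON object with the decoded key/value pairs in document order, so nested objects are checked as well.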
msg400696 - Author: Kevin Mills (Zeturic) Date: 2021-08-31 07:08
Sorry to the people I'm pinging, but I just noticed the initial dictionary in my example code is wrong. I figured I should fix it before anybody tested it and got confused about it not matching up with my description of the results.

It should've been:

import json
d1 = {"1": "fromstring", 1: "fromnumber"}
string = json.dumps(d1)
print(string)
d2 = json.loads(string)
print(d2)
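
With the corrected dictionary, the observations from the first message can be checked directly. This is just a small sketch that restates those claims as assertions:

```python
import json

d1 = {"1": "fromstring", 1: "fromnumber"}
string = json.dumps(d1)
d2 = json.loads(string)

# Both keys are written as the JSON string "1", so the document has a duplicate key.
assert string == '{"1": "fromstring", "1": "fromnumber"}'

# On loading, the last duplicate wins, so one entry silently disappears.
assert len(d1) == 2 and len(d2) == 1
assert d1["1"] == "fromstring"
assert d2["1"] == "fromnumber"
```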
msg406181 - Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-11-12 00:14
In general this sounds reasonable, but a couple of thoughts / comments:

- If you have a dict with mixed numbers in str format and in number format (i.e. ints as numbers and ints as strings in your case), you are creating problems in many potential places. The core of the problem is logically inconsistent keys rather than the step of conversion to JSON. So the most useful place for warning would be when adding a new key, but that wouldn't be practical.

- Even if something is to be done at conversion to JSON, it's not clear if it should be a warning (would that be enough when the conversion is a logical bug?), or it should be some kind of strict=True mode that raises a ValueError?
msg406182 - Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-11-12 01:06
Another good option would be to use a type-annotated dict, like `mydict: dict[int, str] = {}`, and use typed values when populating it; this way a type checker will warn you of inconsistent key types.
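
A small sketch of the annotation approach, assuming Python 3.9+ for the `dict[int, str]` syntax. Note that the annotation is not enforced at runtime; it only helps if a static type checker (mypy, for example) is run over the code.

```python
import json

mydict: dict[int, str] = {}
mydict[1] = "fromnumber"

# A static type checker would be expected to flag the next line, because the
# key is a str rather than an int. At runtime Python accepts it silently.
mydict["1"] = "fromstring"

# Without a checker, the mixed keys still collapse on a JSON round trip.
roundtrip = json.loads(json.dumps(mydict))
assert len(mydict) == 2 and len(roundtrip) == 1
```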
History
Date                 User        Action  Args
2021-11-12 01:06:23  andrei.avk  set     messages: + msg406182
2021-11-12 00:14:55  andrei.avk  set     nosy: + andrei.avk; messages: + msg406181
2021-08-31 07:08:50  Zeturic     set     messages: + msg400696
2021-08-31 05:58:32  corona10    set     nosy: + corona10
2021-08-31 01:36:47  rhettinger  set     nosy: + bob.ippolito
2021-08-31 00:30:37  Zeturic     create