The problem we have is that we're trying to store JSON provided by the user. Mea...

MBCook · on Dec 14, 2016

See, that's the best use case I've seen in this entire discussion. If you don't get to control whether blank vs. null is significant, then I can see how you'd have a real issue on your hands.

I'm guessing you have to keep it in a form you can search (so you can't just GZIP it or something like that)?

idbehold · on Dec 14, 2016

This is all correct.

richardwhiuk · on Dec 13, 2016

Append a zero width space to all strings, and remove it on the way out.

mikeash · on Dec 13, 2016

Using a printable character (or sequence of characters) seems like it would be a lot better. It'll ugly up your database, but at least it'll be obvious that there's a difference. Having all your strings possess two forms which are visually identical but are not actually the same sounds unpleasant.

mrob · on Dec 13, 2016

I think it would be better to use something both printable and commonly used for placeholders, e.g. an underscore, so it's obvious if you forget to remove it and unlikely to seriously confuse anybody.

networked · on Dec 13, 2016

What if the user has a string ending with a meaningful zero width space already stored? For example, the string could be checksummed somewhere. It would corrupt their data.

If you want a kludge for this, it's better to generate a longish random string (e.g., a UUID) to indicate an empty value.
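
A minimal sketch of the sentinel idea, assuming plain string values. The constant `EMPTY_SENTINEL` and the helper names `encode` and `decode` are hypothetical, and the UUID inside the sentinel is an arbitrary example value that would be generated once and then hard-coded:

```python
# Hypothetical sentinel standing in for the empty string. It must stay fixed
# for the lifetime of the data; regenerating it would orphan stored values.
EMPTY_SENTINEL = "empty-string:3f2b8c1e-9d4a-4e6b-8a57-0c1d2e3f4a5b"


def encode(value: str) -> str:
    """Swap an empty string for the sentinel before storing."""
    return EMPTY_SENTINEL if value == "" else value


def decode(stored: str) -> str:
    """Swap the sentinel back to an empty string after reading."""
    return "" if stored == EMPTY_SENTINEL else stored
```

Non-empty strings are stored unchanged, so existing data needs no migration. The remaining risk is a user string that happens to equal the sentinel, which the length and randomness of the value are meant to make very unlikely.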

eriknstr · on Dec 13, 2016

When you get a string from the client you prepend a single zero width space. When you send a string back you strip the single leading space you added. The client will always have the exact same data back that they sent originally.
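
As a rough sketch of that round trip, assuming every stored string goes through the same pair of helpers (the names `escape` and `unescape` are made up for illustration):

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def escape(value: str) -> str:
    """Prepend exactly one zero width space so the stored value is never empty."""
    return ZWSP + value


def unescape(stored: str) -> str:
    """Strip exactly one leading zero width space, the one escape() added."""
    if not stored.startswith(ZWSP):
        raise ValueError("stored value is missing the escape prefix")
    return stored[len(ZWSP):]
```

Because only the single prefix added by `escape` is removed, a client string that itself begins or ends with a zero width space comes back unchanged.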

networked · on Dec 13, 2016

You're right, of course. Sorry, I wasn't clear. I meant that a user might have stored a string with a zero width space at the end by the time you introduce this escaping mechanism. (I've already edited the comment to indicate this.) The same goes double if you append a common printable character. You'd have to rely on some additional indicator, such as the date and time the record with the string was stored, to know whether to unescape a string, and also be sure nothing changed that date and time without escaping the data.

eriknstr · on Dec 13, 2016

Oh, yeah, in that case I'd either if-case it by time stamp or I'd prepend a zero width space to all historical data as well. I would prefer doing the latter and would only do the former if there was some reason I couldn't do the latter, for example if I had too much historical data to be able to process it (though I have a hard time imagining that happening for something so trivial as prepending a zero width space, unlike say converting thousands of hours of video, which might actually be too time-consuming or computationally expensive).

One issue that might arise with altering historical data that I can imagine would be if it was ever necessary to restore from backup and your backup was made before you later added the zero width space, and then you forget to add the zero width space again when you restore from backup a few months down the road. But with proper documentation and procedures that shouldn't happen.
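
The two options could look roughly like this. It is only a sketch: the cutover date `ESCAPING_INTRODUCED` is an invented example value, the helper names are hypothetical, and it assumes each record carries a trustworthy stored-at timestamp:

```python
from datetime import datetime, timezone

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Hypothetical moment at which the escaping mechanism went live.
ESCAPING_INTRODUCED = datetime(2016, 12, 14, tzinfo=timezone.utc)


def read_value(stored: str, stored_at: datetime) -> str:
    """Option 1: if-case by time stamp, unescaping only post-cutover records."""
    if stored_at < ESCAPING_INTRODUCED:
        return stored  # legacy record, never escaped
    if not stored.startswith(ZWSP):
        raise ValueError("post-cutover value is missing the escape prefix")
    return stored[len(ZWSP):]


def migrate_legacy(stored: str) -> str:
    """Option 2: escape a historical value once so all records look alike."""
    return ZWSP + stored
```

With option 2 the migration has to run exactly once per legacy record, including after any restore from a pre-migration backup; running it twice would leave an extra zero width space in the data.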

gamegoblin · on Dec 13, 2016

I believe the parent is saying

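    EXTRA_SPACE = "\u200b"  # zero width space
    string_to_store = user_string + EXTRA_SPACE
    dynamo.store(key, string_to_store)  # `dynamo` is a stand-in client

    ...

    stored_string = dynamo.retrieve(key)
    user_string = stored_string[:-len(EXTRA_SPACE)]  # drop the extra space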

That way the user puts a string in and gets the same string out. No problem.

hehheh · on Dec 13, 2016

Is the zero width space destined to be the next maligned value, comparable to null?

idbehold · on Dec 13, 2016

It's too bad DynamoDB can't just do this conversion for me.

rjdavis3 · on Dec 13, 2016

While not ideal, you can create a new AttributeTransformer that sets a placeholder when storing into DynamoDB and removes it when pulling out of DynamoDB, and pass it in as part of your DynamoDBMapper instantiation.

I did this to convert some String Sets (SS) in my database to String Lists (L). I almost did this same thing to fix the empty String issue but didn't have the time to implement it yet.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/ama...

idbehold · on Dec 14, 2016

Unfortunately, I'm working in a Node environment. The Java SDK for DynamoDB seems much nicer to work with. For example, I believe that you can do transactions with the DynamoDB interface for Java.

rjdavis3 · on Dec 14, 2016

Sorry, I saw the link was referencing the Java SDK, so I thought you were using the same. The DynamoDBMapper in the Java SDK has been an easy-to-use and readable ORM for me. Adding annotations to define keys and attributes has been great. That said, I'm not sure if JavaScript has an equivalent to the AttributeTransformer I mentioned.
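
Without a mapper-level hook, the same placeholder swap can be applied by hand to the whole document just before writing and just after reading. Below is a minimal sketch of that idea, written in Python to stay language-neutral and assuming the user's JSON has already been parsed into dicts, lists, and scalars; `map_strings`, `to_stored`, and `from_stored` are hypothetical names, not part of any SDK:

```python
from typing import Any, Callable

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def map_strings(value: Any, fn: Callable[[str], str]) -> Any:
    """Return a copy of a JSON-like value with fn applied to every string value."""
    if isinstance(value, str):
        return fn(value)
    if isinstance(value, list):
        return [map_strings(item, fn) for item in value]
    if isinstance(value, dict):
        # Keys are left untouched; only values are transformed.
        return {key: map_strings(item, fn) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


def to_stored(doc: Any) -> Any:
    """Prefix every string so no empty string reaches the database."""
    return map_strings(doc, lambda s: ZWSP + s)


def from_stored(doc: Any) -> Any:
    """Remove the one-character prefix that to_stored() added."""
    return map_strings(doc, lambda s: s[len(ZWSP):])
```

The recursion has the same shape in JavaScript, so it should port to a Node codebase with little change.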