Fix improper charset handling in PY2 path of u(x)

I know full well that I may have just added another layer of impropriety, but this change fixed the charset errors I was getting. I'll illustrate it with an example string `Charšet`, entered (e.g. through stdin) in UTF-8 encoding. To the best of my knowledge, the previous version would first have encoded this string to `Char\xc5\xa1et` (i.e., it escaped each byte outside the ASCII range as a hex escape code), and then have parsed that escaped string to `CharÅ¡et` (i.e., after "r" it sees the two Unicode code points U+00C5 and U+00A1 instead of the single code point U+0161). My version simply takes this `str` for what it is: a UTF-8 representation of the unicode string `Charšet`.
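The difference can be reproduced roughly as follows. This is only a sketch, written for Python 3 so that it runs today. Python 2's `string-escape` codec does not exist there, so the escaped intermediate form is spelled out by hand as an assumption about what the old code produced. The variable names are made up for illustration.

```python
# The example string as the UTF-8 bytes a Python 2 `str` would have held.
raw = "Char\u0161et".encode("utf-8")  # b'Char\xc5\xa1et'

# Old path (emulated): the non-ASCII bytes were first turned into literal
# backslash escapes, written out here by hand as an assumption...
escaped = rb"Char\xc5\xa1et"
# ...and "unicode_escape" then read each \xNN as its own code point.
old_result = escaped.decode("unicode_escape")  # 'CharÅ¡et' (U+00C5, U+00A1)

# New path: treat the bytes as what they are, UTF-8 encoded text.
new_result = raw.decode("utf-8")  # 'Charšet' (U+0161)

print(ascii(old_result))
print(ascii(new_result))
```

The old path yields eight characters instead of seven, because the two bytes of the UTF-8 sequence for `š` are each mapped to a separate code point.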
2025-06-28 13:36:14 +02:00 · 2015-03-22 16:34:58 +01:00
commit 354cc3244c
parent 1a65ae57cb
1 changed file with 1 addition and 6 deletions
--- a/jrnl/util.py
+++ b/jrnl/util.py
@@ -67,12 +67,7 @@ def set_keychain(journal_name, password):

 def u(s):
    """Mock unicode function for python 2 and 3 compatibility."""
-    if PY3:
-        return str(s)
-    elif isinstance(s, basestring) and type(s) is not unicode:
-        return unicode(s.encode('string-escape'), "unicode_escape")
-    return unicode(s)
-
+    return s if PY3 or type(s) is unicode else s.decode("utf-8")

 def py2encode(s):
    """Encode in Python 2, but not in python 3."""