Reverse Engineering Synology’s NoteStation for a Blog System

My Blogging Journey

Since high school, I have been tinkering with my blog in various ways: starting with ready-made blog sites, then messing around with systems like WordPress, and later writing blog systems in various languages…

In short, if I’m not tinkering, I’m on my way to tinker (shamefully, I haven’t written many blog posts…). However, after all that tinkering, something always left me dissatisfied: hosted blog sites shut down, WordPress themes and features never quite fit, and the systems I wrote myself had to be hosted on a VPS…

Summed up, these problems boil down to the following requirements:

Infrastructure needs to be stable and reliable
The blog system needs to be fully controllable
Writing needs to be convenient

Fortunately, after I started working, I got a Synology NAS. There’s a joke that Synology sells NAS hardware just to make friends and that the real value is in the DSM system. I have to say DSM is indeed very easy to use: it has a rich set of applications, and its support and ecosystem are well established.

At first I used the international version of Evernote, but Evernote became increasingly unfriendly to free users, which forced me to abandon it. Later I tried OneNote, Youdao Cloud Notes, Wiz Notes, etc., and gave each of them up because of various problems. The search only ended when I found DSM’s NoteStation.

Let’s first look at the advantages of NoteStation (whether it’s an advantage is judged based on my needs):

Excellent multi-platform support (especially for Linux and Android platforms)
Has a Chrome plugin similar to Evernote’s (it quickly saves web pages, which makes it a superb tool for reposting)
Private cloud (infrastructure is fully controllable)

Comparing these advantages with the blog requirements above, you will find that the fit is very good. However, NoteStation has one problem: it is closed source and cannot be extended.

I had long wanted to use NoteStation as a blogging backend, but because it is closed source I never implemented the idea until I saw this article ^{Reference 1}. I strongly recommend reading it before you continue.

After reading that article ^{Reference 1}, I basically understood how NoteStation organizes its documents. What remained was to implement, step by step, the original idea of “building a blog system with NoteStation”.

Encoding Problem

During the implementation I ran into a critical problem. Let’s first look at the storage structure of a note:

The metatext.json, basic.json, etc. in the screenshot above store basic information about the note. The weird part is the version/text directory: its files store the main content of the note, but what on earth is this {XX} format? The whole filename looks like some kind of special encoding and, most importantly, it appears to be a private one.

Without understanding this encoding, the blog backend cannot be implemented “elegantly” - you can’t hard-code the filenames into the program, right? Having finally figured out NoteStation’s file organization, would I give up an idea I had held for years because of such a small detail? You already know the answer, otherwise where would this blog post come from?

Solution to the Encoding Problem

DSM is essentially a website, and every website boils down to a web server plus applications. NoteStation, as one application of this DSM “website”, naturally needs its own service program.

A little research shows that most DSM applications are implemented as CGI programs (probably because of the closed-source requirement, since shipping PHP would amount to shipping the source). DSM itself is a web service running on a *nix system, and NoteStation interacts with users as a CGI application.

Therefore, we can start with the .so files (shared libraries on *nix systems). When DSM is first set up you have to choose an installation volume, and all DSM applications are installed on that volume (/volume2 in my case). First use the following command to see which .so files are in the NoteStation installation directory:

find ./ -type f -iname '*\.so*'

The result is as follows:

From the directory names we can guess that the files under webapi mainly serve the user-facing web pages, while the files under lib provide various basic functions.

But that alone doesn’t help - we need to know how to decode the mysterious encoding mentioned above.

We are not entirely without clues, though: we know the program has to process the files in the version/text directory. In other words, version/text is our way in.

Since Synology’s system lacks many basic commands, I copied the files into one directory and downloaded them together. The copy command is the one in the last line of the picture above:

find ./ -type f -iname '*\.so*' -exec cp {} /volume1/Documents/Sos \;

Then use the following command to extract all printable strings from the .so files and keep only the ones containing text/:

strings -f * | grep -Pi 'text/'

The result is as follows:

Luckily, only a few files match. The webapi and lib split noted earlier turns out to be useful after all: we basically only need to look at libsynodrive.so.6.0, because the other matches are related to HTTP text (and those files are all in the webapi directory).
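If the strings utility is not at hand either, the same scan can be approximated in a few lines of TypeScript. This is only a rough sketch: extractStrings and findStrings are hypothetical helper names, and the scan handles printable ASCII only, with the minimum run length of 4 that strings uses by default.

```typescript
// Collect every run of at least `minLength` printable ASCII bytes,
// which is roughly what `strings` reports with its default settings.
function extractStrings(data: Uint8Array, minLength = 4): string[] {
  const found: string[] = [];
  let run = "";
  for (let i = 0; i < data.length; i++) {
    const byte = data[i];
    if (byte >= 0x20 && byte <= 0x7e) {
      run += String.fromCharCode(byte);
      continue;
    }
    // A non-printable byte ends the current run.
    if (run.length >= minLength) found.push(run);
    run = "";
  }
  if (run.length >= minLength) found.push(run);
  return found;
}

// Case-insensitive filter, standing in for `grep -i`.
function findStrings(data: Uint8Array, needle: string): string[] {
  const wanted = needle.toLowerCase();
  return extractStrings(data).filter((s) => s.toLowerCase().includes(wanted));
}
```

In Node.js the bytes can come straight from fs.readFileSync(path), since the Buffer it returns is a Uint8Array; findStrings(bytes, 'text/') then plays the role of the pipeline above for a single file.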

Bring out the almighty IDA, press Shift+F12 to open the strings view, and locate /text/*:

There are two references. On inspection, the one at sub_583a0+9C is not what we need, which conveniently leaves only one. Navigating to it shows an assignment to pattern (line 171 in the picture below):

Following the cross-references of pattern shows that line 303 in the picture above is the key position: v107 is assigned to v129 through std::string::assign, and the SYNODriveDecode call at line 306 uses v129. SYNODriveDecode looks like it can tell us how to decode the mysterious encoding:

__int64 __fastcall SYNODriveDecode(const std::string *a1, unsigned __int8 *a2, size_t a3, char a4)
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]
  v4 = *(const char **)a1;
  v5 = std::string::_Rep::_S_empty_rep_storage;
  n[0] = a3;
  v6 = *((_QWORD *)v4 - 3);
  v27[0] = (__int64)off_2E7F50 + 24;
  if ( !v6 )
  {
    // Error handling code omitted
    ……
LABEL_3:
    ……
    goto LABEL_4;
  }
  if ( a4 )
  {
    std::string::assign((std::string *)v27, a1);
    v11 = 0LL;
LABEL_8:
    bzero(a2, n[0]);
    // SLIBCBase64Decode is a DSM helper function; presumably it wraps a Base64 decoder.
    if ( (unsigned int)SLIBCBase64Decode(v27[0], *(_QWORD *)(v27[0] - 24), a2, n) )
    {
      v8 = 1;
    }
    else
    {
      // Error handling code omitted
      ……
      v8 = 0;
    }
  }
  else
  {
    v11 = (char *)calloc(v6 + 1, 1uLL);
    if ( !v11 )
    {
      // Error handling code omitted
      ……
      goto LABEL_3;
    }
    snprintf(v11, v6 + 1, "%s", v4);
    v22 = &v25;
    while ( 1 )
    {
      v13 = strchr(v11, '{');
      v14 = v13;
      // No '{' left: append the rest of the string and go on to the Base64 step
      if ( !v13 )
      {
        v21 = strlen(v11);
        std::string::append((std::string *)v27, v11, v21);
        goto LABEL_8;
      }
      *v13 = 0;
      v15 = strlen(v11);
      std::string::append((std::string *)v27, v11, v15);
      *v14 = '{';
      v16 = strchr(v14, '}');
      // A '{' without a closing '}': leave the loop and report the error below
      if ( !v16 )
        break;
      *v16 = 0;
      v11 = v16 + 1;
      v17 = strtol(v14 + 1, 0LL, 10);
      *(v11 - 1) = '}';
      std::string::string(v28, 1LL, (unsigned int)v17, &v25);
      // strtol has converted the xx in {xx} to a decimal integer;
      // append that integer, as a single character, to the end of the string being built
      std::string::append((std::string *)v27, (const std::string *)v28);
      v18 = v28[0] - 24;
      if ( (void *)(v28[0] - 24) != v5 )
      {
        // Error handling code omitted
        ……
      }
      if ( !v11 )
        goto LABEL_8;
    }
    syslog(3, "%s:%d Failed [%s], err=%m\n", "common/synodrive_common.cpp", 839LL, "NULL == szEnd");
    SYNODriveErrAppendEx("common/synodrive_common.cpp", 839, "NULL == szEnd", (char)&v25);
    v8 = 0;
  }
LABEL_4:
  free(v11);
  // Error handling code omitted
  ……
  return v8;
}

Okay, the logic is basically clear: take the string inside each {}, parse it as a decimal number, put the character with that ASCII value back at the same position in the original text, and finally decode the result with the private function SLIBCBase64Decode.

As for the private function SLIBCBase64Decode, I’m too lazy to look up where it is implemented. For now I’ll just close my eyes and assume it is a plain Base64 decoder (a descriptive symbol name is a wonderful thing: Synology-LIBC-Base64Decode).

That’s still a bit abstract, so take version/text/Y{110}J{112}ZWY{61} as an example:

ASCII(110) -> n
ASCII(112) -> p
ASCII(61) -> =

So version/text/Y{110}J{112}ZWY{61} is converted to: version/text/YnJpZWY=

Now that’s much easier on the eyes; YnJpZWY= is clearly Base64:

It decodes to brief, that is, the note’s introduction~
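The worked example can also be checked in a few lines of TypeScript. This is only a quick sanity check under the assumption that SLIBCBase64Decode behaves like standard Base64; the variable names are made up for illustration, and Buffer comes from Node.js.

```typescript
import { Buffer } from "node:buffer";

// The escaped file name from the example above.
const escapedName = "Y{110}J{112}ZWY{61}";

// Step 1: replace every {NN} group with the character whose code is NN.
const base64Name = escapedName.replace(/\{(\d+)\}/g, (_match, code: string) =>
  String.fromCharCode(parseInt(code, 10))
);

// Step 2: Base64-decode the result (assumed to be what SLIBCBase64Decode does).
const plainName = Buffer.from(base64Name, "base64").toString("utf8");

console.log(base64Name); // YnJpZWY=
console.log(plainName); // brief
```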

Coding Time

With the analysis verified, what remains is the easy part.

From the above analysis, it is easy to write the decoding code (TypeScript syntax):

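const bs_filenameDecode = (filename : string) : string => {
  let unescaped = '', current_position = 0;
  while (true) {
    const curly_start_position = filename.indexOf('{', current_position);
    // No {xx} group left: keep the remaining tail and stop
    if (curly_start_position === -1) {
      unescaped += filename.substring(current_position);
      break;
    }
    const curly_end_position = filename.indexOf('}', curly_start_position);
    // A '{' without a closing '}' is malformed (SYNODriveDecode fails here as well)
    if (curly_end_position === -1) throw new Error(`Malformed encoded filename: ${filename}`);
    unescaped += filename.substring(current_position, curly_start_position);
    // Turn the decimal number inside {xx} back into a single character
    unescaped += String.fromCharCode(
      parseInt(filename.substring(curly_start_position + 1, curly_end_position), 10)
    );
    current_position = curly_end_position + 1;
  }
  // Final step, as in SYNODriveDecode: Base64-decode the unescaped string
  return Buffer.from(unescaped, 'base64').toString();
}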

As for why the author of NoteStation chose such a strange encoding, I can’t guess. libsynodrive.so.6.0 also exports a function called SYNODriveEncode, which produces this “mysterious encoding”. I won’t post its IDA Hex-Rays output here and will only state the result: the function Base64-encodes the input string, then converts every character outside [A-Z0-9] into the form {corresponding ASCII value}.

The corresponding encoding logic is as follows (TypeScript syntax):

const bs_filenameEncode = (filename : string) : string => Buffer.from(filename)
  .toString('base64')
  .split('')
  .map(
    c => /[A-Z0-9]/.test(c) ? c : `{${c.charCodeAt(0)}}`
  ).join('');

With bs_filenameDecode and bs_filenameEncode, you can happily read the introduction, main content and other data of various notes in NoteStation.
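As a rough illustration of how a backend might use the two directions together, here is a minimal self-contained sketch. Everything in it beyond the encoding rules is an assumption: encodeName and decodeName are compact stand-ins for the two functions above, listTextParts and readTextPart are hypothetical helpers, noteDir is a hypothetical path to one note’s directory, and the files under version/text are assumed to be readable as UTF-8 text.

```typescript
import { Buffer } from "node:buffer";
import { existsSync, readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

// Base64-encode, then escape every character outside [A-Z0-9] as {code}.
const encodeName = (plain: string): string =>
  Buffer.from(plain, "utf8")
    .toString("base64")
    .replace(/[^A-Z0-9]/g, (c) => `{${c.charCodeAt(0)}}`);

// Undo the {code} escaping, then Base64-decode.
const decodeName = (escaped: string): string =>
  Buffer.from(
    escaped.replace(/\{(\d+)\}/g, (_m, code: string) => String.fromCharCode(parseInt(code, 10))),
    "base64"
  ).toString("utf8");

// List the decoded names of the text parts stored for one note.
function listTextParts(noteDir: string): string[] {
  return readdirSync(join(noteDir, "version", "text")).map(decodeName);
}

// Read one text part by its plain name, or return null if it does not exist.
function readTextPart(noteDir: string, part: string): string | null {
  const file = join(noteDir, "version", "text", encodeName(part));
  return existsSync(file) ? readFileSync(file, "utf8") : null;
}
```

For example, readTextPart(noteDir, 'brief') would look for the file version/text/Y{110}J{112}ZWY{61} inside that note’s directory.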

Postscript

Actually, the entire blog system was finished a while ago; it’s just that, being hopelessly lazy, I only got around to writing about it now.

Building the whole blog actually involved many security issues - you don’t think the backend runs directly on the Synology NAS, do you? Or that the system isn’t isolated? Or that any note can be viewed through the blog system? If I get the chance, I will write a separate post about these issues.

Finally, here is a screenshot of the blog backend which I want to show off a little bit ;)

Reference

Record of various recovery methods for NoteStation log data corruption in Synology system

Debugwar Blog