CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages

bigdenis
 
 
Posts: 17
Joined: Sat Mar 02, 2019 11:09 am

Post by bigdenis »

Dear Pablo!

I'm from Vojvodina in the north of Serbia, where many languages are spoken, so sites are multilingual. I'm having difficulties with CMS Search: everything else seems to be OK, but the search behaves strangely. I'm illustrating this with your CMS demo project, with no custom code. I turned on Unicode support for CMS Admin and CMS View. The editor is the newest CKEditor 4. The database collation is utf8_general_ci and the charset is utf8.

http://www.ntesla.edu.rs/cmseredeti/index.html

Strange behavior with Hungarian language: search can find some of the words but not every word
Strange behavior with Serbian Cyrillic language: search can't find any words
Strange behavior with Serbian Latin language: search can find some of the words but not every word

What is your advice?

I saw your CMS demo at http://www.wysiwygwebbuilder.com/suppor ... php?page=4 and CMS Search works fine with the Cyrillic article there (it finds EVERY word from the article). The difference is that the article there is in Belarusian, so maybe the database configuration is different?

I have completed a new web site with CMS tools and noticed the above mentioned things.
http://www.ntesla.edu.rs/sr_intro.html
In this site I used CMS Tools in 3 places:
http://www.ntesla.edu.rs/prosveta/sr_dogadjaji.php
http://www.ntesla.edu.rs/informacije/sr_dok_skole.php
http://www.ntesla.edu.rs/informacije/sr ... abavke.php
(And in the other language too, of course, with the Hungarian prefix hu_ in the page names.)

I think the simplest way to find the error is through your CMS Demo project as I presented it, with almost nothing changed in it. If you still want my project source based on your CMS Demo, tell me and I will upload it somewhere.

Thanks in advance!
User avatar
Pablo
 
Posts: 21569
Joined: Sun Mar 28, 2004 12:00 pm
Location: Europe
Contact:

Post by Pablo »

Are you sure the database is configured as UTF8/unicode?
Do you see the search words in the database?

Post by bigdenis »

According to the phpMyAdmin screenshot, the database is configured as utf-8:
Image

There are search words in the database in 3 languages...
Serbian cyrillic:
Image

Serbian Latin:
Image

Hungarian:
Image

The screenshots don't show all of them, of course; those would be bigger images...
(But "rnrnФранцуз" is not a real word; "Француз" IS the real word.
"rnrnsrbi" isn't a real word either, but "Srbi" IS.)

Searching for words from the screenshots:
Serbian Cyrillic - word "индоевропских" is in the word list in the database, but the CMS Search can't find it in the article.
Serbian Latin - word "Mađarskoj" is in the word list in the database, but the CMS Search can't find it in the article.
Hungarian - word "anyanyelvűek" is in the word list in the database, but the CMS Search can't find it in the article.

Post by Pablo »

I'm sorry, I don't think I can help you with this.
The CMS script may be incompatible with these languages. Although you are the first user that has reported issues with this.

Post by bigdenis »

Sorry to hear that. Searching among the articles is an important feature. In the meantime I figured out that CMS Search...
...in Hungarian works correctly with words WITHOUT the national characters éáűúőóüí
...in Serbian Latin works correctly with words WITHOUT the national characters čćžđš
...in Serbian Cyrillic is not functional at all

I'm not an expert, but I read a little and found this article on the internet, among others with the same content:
https://mathiasbynens.be/notes/mysql-utf8mb4
It's about full Unicode support and using utf8mb4 instead of utf8. I'm not sure, but may I politely ask you to reconsider your CMS script and implement full Unicode support for the languages mentioned above? It could be an option for developers to choose in the development interface (like the existing checkbox/property for Unicode support in CMS Admin and CMS View).

Maybe UTF-16 is the solution? I'm asking because I have other difficulties with Hungarian/Serbian Latin/Serbian Cyrillic in tables when I import content from a CSV file.
With a plain table, when I import Cyrillic text from a UTF-8 CSV I see incorrect characters, but when I import from a UTF-16 CSV the content appears correctly. Sadly, at the first attempt to edit it, it changes to garbage graphic characters.
When I use your Responsive Data Table extension, the UTF-8 CSV Cyrillic import is incorrect too, but the UTF-16 CSV Cyrillic export is OK and the characters are shown correctly. I will report this anomaly in the right place and category on this forum. I mention it here because it is the same kind of character set encoding problem.

Dear Pablo. Please help me to solve this/these problem(s). If not now then in the future releases. I'm going to make other multilingual projects with CMS tools.

Thanks in advance!

Post by Pablo »

I think the CMS script is Unicode compliant. I have tested it with different Unicode languages.
The documentation you are referring to is all about database configuration.
Unfortunately, I cannot help you with the configuration of the server.

Post by bigdenis »

Dear Pablo!

I accept your verdict, but let me present a few more points.
Once more, I'm not an expert and I'm not pretending to be. I'm only trying to use my own brain...
Before I wrote the previous post to you, I tried some things with the database configuration, of course.

First attempt:
Instead of the plain utf8 charset and collation, I created the database with the utf8mb4 charset and collation. At the same time I tried to adapt your generated script to support utf8mb4 via search/replace. It did not succeed; I probably messed something up. You are the master of your scripts.

Second attempt:
Instead of the plain utf8 charset and collation, I created the database with the utf16 charset and collation. At the same time I tried to adapt your generated script to support utf16 via search/replace. As a result I saw Chinese characters, so it did not succeed either. Once again, you are the master of your own scripts.
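
One possible reading of the Chinese characters, offered only as a guess: if UTF-8 bytes are decoded somewhere as UTF-16, every pair of bytes is read as a single 16-bit unit, and pairs of ASCII letters land in the CJK range. A minimal PHP sketch of that effect using the standard iconv() function; the sample word is taken from this thread, and nothing here comes from the generated CMS script:

```php
<?php
// Bytes of the ASCII word "Srbi" (also valid UTF-8): 53 72 62 69.
$utf8Bytes = 'Srbi';

// Deliberately misread those bytes as big-endian UTF-16, then convert
// the result back to UTF-8 so that it can be printed.
$misread = iconv('UTF-16BE', 'UTF-8', $utf8Bytes);

// The four bytes become two 16-bit units, 0x5372 and 0x6269,
// which both fall inside the CJK ideograph block.
echo $misread, "\n";
```

Whether this is what happened in the second attempt cannot be confirmed from the post; it only shows that a UTF-8/UTF-16 mismatch typically produces this kind of output.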

I'm very sad now... Anyway, thanks for your patience in reading my posts...

Post by Pablo »

To see if the script supports the characters, I have added the words "Mađarskoj" and "anyanyelvűek" to the test page:
http://www.wysiwygwebbuilder.com/suppor ... php?page=4

As you can see, this seems to work correctly, which indicates that the script works for these languages.

Post by bigdenis »

Yes, thank you, you are right!! This is crystal clear now. It's something in the database configuration, but what... I'm going crazy... I've spent days trying to figure it out.
Thanks again for giving me a fixed point for further investigation! I'm fond of WYSIWYG Web Builder and I'm planning to stay with it for the long run.
User avatar
NDV
 
 
Posts: 136
Joined: Sun May 19, 2019 8:27 pm
Location: Ukraine
Contact:

Post by NDV »

Hello, bigdenis!
I have exactly the same problem. Did you find the cause of the problem with the Cyrillic alphabet?
If you did, and if it's not too much trouble, please share the solution.
https://t.me/webart42
I offer my services for website development in WYSIWYG Web Builder, HTML/CSS/JQuery.
Contact us on telegram @webart42
User avatar
BaconFries
 
 
Posts: 5324
Joined: Thu Aug 16, 2007 7:32 pm

Post by BaconFries »

@NDV it is most unlikely that you will get an answer from the original poster, as he was last active: Tue Mar 05, 2019 9:13 pm

Post by Pablo »

Please make sure Unicode support is enabled.

Post by NDV »

CMS Search
Cyrillic (Ukrainian, Russian): No results

CMS View
Enable Unicode support (Yes)

CMS admin
Search Index: true

MySQL, php5.6 and php7.3
utf8_unicode_ci: search works only for Latin; Cyrillic (Ukrainian, Russian): no results
utf8_general_ci: search does not respond
utf8mb4_general_ci: search does not respond

Post by Pablo »


Post by NDV »

I watched it working in the demonstration; Cyrillic search works there.
The question is how to find the error.

The demonstration is just a picture; it is not source files and database dumps that I could look at and compare.
On the Russian forum this question has been asked since 2017. No solution has been found.

Post by Pablo »

- make sure you have the latest version of WWB.
- enable Unicode in all objects (cms admin, cms view).
- set the character set of the page to UTF8
- set the character set of the database to UTF8

Post by NDV »

WWB 15.2.3
Unicode is enabled in all objects (cms admin, cms view).
Page encoding is set to UTF8
Database encoding is set to UTF8
Example:
utf8_unicode_ci.sql
utf8_general_ci.sql
https://www.dropbox.com/sh/fpchcq69cwku ... q4xFa?dl=0

Post by Pablo »

All settings are correct, and I see the generated code is Unicode compliant.
So I think something is wrong on the database side or in the PHP configuration.

Here is an export of my test database:
https://www.wysiwygwebbuilder.com/support/CMS_PAGES.zip

Post by NDV »

I tried different databases:
Debian v8, MySQL 5.5.62, php5.6
Debian v9, MariaDB 10.3, php7.3

A simple set of statements to create the database:

Code:

-- Create the CMS database with a UTF-8 character set and collation
CREATE DATABASE cmsdb CHARACTER SET utf8 COLLATE utf8_unicode_ci;
-- Give the site's database user full access to it
GRANT ALL PRIVILEGES ON cmsdb.* TO 'user'@'localhost';
FLUSH PRIVILEGES;

I also created databases via phpMyAdmin.
To check, I deleted the database and made a new one.
The database uses utf8_unicode_ci.
The CMS_SEARCH_WORDS entries are present.
So I can't imagine what is wrong. Broken brain :(

What is strange is that with the Bootstrap Table and MySQL CRUD extensions, database search works and displays data.

Image

Image

Post by NDV »

Thanks for the dump (CMS_PAGES). I will look for an error.

Post by NDV »

I transferred a test page from the home server to hosting (php7.2). The hosting provider said the PHP settings are correct. Cyrillic search: no results.
With CMS_PAGES: Cyrillic search, no results.
Brain explosion. :mrgreen:

Post by Pablo »

Did you use my database?

Post by NDV »

I tried your CMS_PAGES database on my Debian server. With Cyrillic, the search gives "no result". Latin search works, OK.
After that, I transferred the pages to paid hosting.
Errors appeared on the hosting (php7.2) when exporting the CMS_PAGES database. It loaded with errors, but worked. As before: with Cyrillic the search gives "no result", while Latin search works, OK. After that, I created a new database and let WWB create its own tables, but still no result.

I will have a lot of time at the weekend. I will work on the CMS.

Post by NDV »

A programmer I know helped.
The faulty line is the one below: \W does not work with Cyrillic UTF-8.

Code:

$word = preg_replace('/\W/', '', $word);

Replace it, comment it out, or delete it.
I deleted the line. Search with Cyrillic works, OK!
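
To see why that line breaks non-ASCII search terms, here is a small sketch. The helper name and the sample words (taken from earlier posts) are assumptions for illustration; only the regular expression comes from the generated script. Without the /u modifier, PCRE works on single bytes, and in the usual "C" or UTF-8 locale every byte above 0x7F is treated as a non-word character:

```php
<?php
// Hypothetical helper that mirrors the line quoted above; the function
// name is an assumption, not taken from the generated CMS script.
function strip_non_word_bytes(string $word): string
{
    // No /u modifier: the subject is processed byte by byte, so the
    // bytes of multi-byte UTF-8 letters are matched by \W and removed.
    return preg_replace('/\W/', '', $word);
}

// Words reported earlier in this thread as found or not found.
foreach (['Srbi', 'Mađarskoj', 'anyanyelvűek', 'индоевропских'] as $word) {
    printf("%s => '%s'\n", $word, strip_non_word_bytes($word));
}
```

On a typical setup the ASCII word passes through unchanged, the accented letters drop out of the Serbian Latin and Hungarian words, and the Cyrillic word is reduced to nothing. That would match the symptoms described in this topic, although the exact result depends on the server locale.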

Post by NDV »

I changed the line instead. Cyrillic search works, OK!

Code:

$word = preg_replace('/\W/u', '', $word);

Or this code:

Code:

$word = preg_replace('/[^\w]/u', '', $word);
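
A minimal sketch of the corrected line wrapped in a helper. The function name and the fallback to an empty string are assumptions for illustration, not part of the generated script. The /u modifier makes PCRE decode the subject as UTF-8 and, in PHP, also makes \w and \W Unicode-aware, so letters outside ASCII are kept:

```php
<?php
// Hypothetical helper; the name and the fallback policy are assumptions
// made for this example, not taken from the generated CMS script.
function normalize_search_word(string $word): string
{
    // With /u only real non-word characters (punctuation, spaces,
    // quotes) are removed; Cyrillic and accented letters survive.
    $clean = preg_replace('/\W/u', '', $word);

    // With /u, preg_replace() returns null when the input is not valid
    // UTF-8, so fall back to an empty string instead of passing null on.
    return $clean ?? '';
}

// Punctuation is stripped, the Cyrillic word itself is kept.
var_dump(normalize_search_word('"Француз",') === 'Француз');
```

The two patterns posted above, /\W/u and /[^\w]/u, match the same characters, so either one can be used. The null check matters only if the search term may arrive in an encoding other than UTF-8.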

Post by Pablo »

Thanks, I will investigate if this can be implemented in a future version.

Post by bigdenis »

I hadn't figured out a solution to this problem until today. In the meantime I made some CMS sites with this "half functional" search. I'm glad that NDV was so persistent and found the solution. I tested his solution with the corrected line in the script, and I can say that it works perfectly in CMS Search for Serbian Latin, Serbian Cyrillic and Hungarian too!
Pablo, please implement this solution in a future release! Thanks in advance.

Post by Pablo »

The modification has now been implemented in the latest build (02/16/2020)
https://www.wysiwygwebbuilder.com/download.html

Post by NDV »

Thank you! :D

Post by bigdenis »

Thank you, Pablo!