Monday, March 26, 2018

Loading a long value into a String variable from a text file

A String can in principle hold up to Integer.MAX_VALUE characters. A string literal, however, is stored in the constant pool of the class file with a two-byte length field, so its encoded form cannot exceed 65535 bytes. If you try to compile a class with a longer literal, the compiler reports the error "constant string too long". Sometimes, for example in tests, one needs longer String values. Such values can be loaded from a file.
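
The limit concerns only literals compiled into the class file, not strings created at run time. A minimal sketch to illustrate this (the class name LongStringDemo and the length 70,000 are chosen here purely for illustration):

```java
import java.util.Arrays;

public class LongStringDemo {

    // Builds a string of the requested length at run time. No literal is involved,
    // so the 65535-byte limit of the class file does not apply.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String longValue = repeat('a', 70_000);
        System.out.println(longValue.length()); // 70000
    }
}
```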

Suppose a String variable has to be assigned the entire contents of a file output.xml, which contains ~400,000 characters and is on the classpath. The method inputStreamToString loads the contents of an InputStream into a String:

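import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Reads the stream as UTF-8 in 8K-char chunks; the stream is closed when done.
public class FileUtils {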

    public String inputStreamToString(InputStream is) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024 * 8];
        try (BufferedReader in = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
            int length;
            while ((length = in.read(buffer)) > -1) {
                sb.append(buffer, 0, length);
            }
        }
        return sb.toString();
    }
}

A test class:

public class FileUtilsTest {

    FileUtils i = new FileUtils();

    @Test
    public void testInputStreamToStringFromResourceFile() throws IOException {
        String resp = i.inputStreamToString(getClass().getResourceAsStream("/output.xml"));
        System.out.println(resp.length());// 358,830
        assertEquals(resp.length(), 358779);
    }

    @Test
    public void testInputStreamToStringFromString() throws IOException {
        System.out.println(Charset.defaultCharset()); // windows-1252
        String str = "this is a test string to be converted to InputStream";
        String copy = i.inputStreamToString(new ByteArrayInputStream(str.getBytes(StandardCharsets.UTF_8)));
        assertEquals(str.length(), copy.length());
    }
}

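Another, shorter possibility, where getPathInResources is a helper (not shown) that resolves the file name to a Path in the resources folder: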

    public String readFileToString(String fileName) throws IOException {
        Path filePath = getPathInResources(fileName);
        return new String(Files.readAllBytes(filePath), StandardCharsets.UTF_8);
    }
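
If the input is an InputStream rather than a Path, a similarly short variant is possible, assuming Java 9 or later, where InputStream.readAllBytes() is available. This is only a sketch; the class name StreamReader is made up for the example, and the IOException is wrapped in an unchecked exception just to keep the calling code short:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamReader {

    // Reads the whole stream into memory, decodes it as UTF-8 and closes the stream.
    // Requires Java 9+ because of InputStream.readAllBytes().
    static String inputStreamToString(InputStream is) {
        try (InputStream in = is) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String str = "this is a test string";
        InputStream is = new ByteArrayInputStream(str.getBytes(StandardCharsets.UTF_8));
        System.out.println(str.equals(inputStreamToString(is))); // true
    }
}
```

Like Files.readAllBytes, this holds the raw bytes and the decoded String in memory at the same time, which is fine for a test resource of a few hundred thousand characters.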

How to convert a String to an InputStream

InputStream is = new ByteArrayInputStream(str.getBytes(StandardCharsets.UTF_8));

The code is used in the second test above.

How to save a String to a file
Files.write(Paths.get("src\\test\\resources\\xml0.xml"), str.getBytes(StandardCharsets.UTF_8));
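
To check that writing and reading use the same charset, the two steps can be combined into a round trip. The sketch below writes to a temporary file instead of the resources folder; the class name RoundTrip and the sample XML are assumptions made for the example:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RoundTrip {

    // Writes str to a temporary file as UTF-8, reads it back as UTF-8,
    // deletes the file and returns what was read.
    static String writeAndReadBack(String str) {
        try {
            Path tmp = Files.createTempFile("roundtrip", ".xml");
            try {
                Files.write(tmp, str.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 is an accented e; it takes two bytes in UTF-8
        String original = "<root>caf\u00e9</root>";
        System.out.println(original.equals(writeAndReadBack(original))); // true
    }
}
```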

Wednesday, March 14, 2018

Changing font size and default language in SQL developer or Data Integrator

Changing the default tiny font size

By default, the letters in SQL Developer and Oracle Data Integrator (ODI) are hardly readable. The steps to increase the font size are much the same on Windows and Linux. Search for a file named ide.properties in %userprofile%/AppData on Windows or in $HOME on Linux. One file will be found in the SQL Developer folder, whereas two files will be found in the ODI folder (I do not know why, but it does not matter).

In the SQL Developer file, uncomment the line containing Ide.FontSize=18 and set a convenient font size (no less than 18).

Add the same line to the two ODI files, which seem to be identical on Linux and different on Windows (the last two pictures).

Restart the applications.

Changing the default language to English

Java uses the default locale of the computer. To change the language, one needs to set JVM system properties. This can be done in the configuration file ide.conf located in the installation folder of SQL Developer or ODI, e.g. C:\oracle\Middleware\Oracle_Home. Add two lines to the end of the file and then restart the application:

AddVMOption -Duser.language=en
AddVMOption -Duser.country=US
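
The effect of these two properties can be checked outside the IDE with a small program started with the same options, e.g. java -Duser.language=en -Duser.country=US LocaleCheck. This is only a sketch; the class name LocaleCheck is made up for the example:

```java
import java.util.Locale;

public class LocaleCheck {

    // Returns the locale-related system properties and the resulting default locale.
    static String describe() {
        return "user.language=" + System.getProperty("user.language")
                + ", user.country=" + System.getProperty("user.country")
                + ", default locale=" + Locale.getDefault();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```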

Thursday, March 8, 2018

Adding a datasource to Tomcat and specifying it in persistence.xml

To add a datasource connecting to an Oracle database, save the Oracle driver ojdbc7.jar into CATALINA_HOME/lib. Then add a Resource element with the connection details to CATALINA_HOME/conf/context.xml:

<Resource name="jdbc/saphirOracleDB" auth="Container" type="javax.sql.DataSource"
        maxTotal="20" maxIdle="30" maxWaitMillis="10000"
        username="username" password="password" driverClassName="oracle.jdbc.OracleDriver"
        url="jdbc:oracle:thin:@//hostname:1555/servicename"/>

A sample persistence.xml that uses the created datasource refers to it by its JNDI name java:/comp/env/jdbc/saphirOracleDB:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1" xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
  <persistence-unit name="Saphir" transaction-type="RESOURCE_LOCAL">
    <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
    <non-jta-data-source>java:/comp/env/jdbc/saphirOracleDB</non-jta-data-source>
    <class>entities.sqlresultmapping.DummyForMapping</class>
  </persistence-unit>
</persistence>

The entity manager can be obtained in the web application by using the specified persistence unit name:

    private static EntityManagerFactory emf = Persistence.createEntityManagerFactory("Saphir");
    private static EntityManager em = emf.createEntityManager();
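
If the entity manager factory fails to start, it can help to first check that the datasource is actually bound under the expected JNDI name. The following is a minimal sketch using only the standard javax.naming and javax.sql APIs; the class name DataSourceCheck is hypothetical, and the lookup only succeeds when the code runs inside Tomcat (e.g. called from a servlet):

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class DataSourceCheck {

    // Returns the DataSource bound under the given JNDI name, or null when the
    // name cannot be resolved (for example, when running outside Tomcat).
    static DataSource lookup(String jndiName) {
        try {
            Object found = new InitialContext().lookup(jndiName);
            return (found instanceof DataSource) ? (DataSource) found : null;
        } catch (NamingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        DataSource ds = lookup("java:/comp/env/jdbc/saphirOracleDB");
        System.out.println(ds != null ? "datasource found" : "datasource not available");
    }
}
```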

Removing accents and other diacritical marks from Unicode text to convert it into English letters

Often I need to convert Unicode text, e.g. French, into plain English letters. The general way to remove diacritical marks is to decompose each character into a base letter followed by separate chars for its marks, using Normalizer with the form NFD, and then to remove all chars holding diacritical marks with the regular expression \p{InCombiningDiacriticalMarks}+, which matches the "Combining Diacritical Marks" Unicode block.

The sample class below uses as its input a meaningless text made up of French words with various accents:

import java.text.Normalizer;

public class Clean {

    static void describe(String str) {
        System.out.println(str + " " + str.length());
    }

    public static void main(String[] args) {
        String str = "«J'ai levé la tête. Il doit être français». Il n'a pensé à lui ôter l'âge et se met à nager âgé.";
        describe(str);
        String normalizedString = Normalizer.normalize(str, Normalizer.Form.NFD);
        // the regexp corresponds to Character.UnicodeBlock.COMBINING_DIACRITICAL_MARKS
        String noDiacriticalMarks = normalizedString.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        describe(normalizedString);
        describe(noDiacriticalMarks);
    }
}

In the output, the first line is the original string. The second is the same string after normalization: it looks the same, but the accents are now stored as separate characters, which are removed in the third line. Each line ends with the length of the string.

«J'ai levé la tête. Il doit être français». Il n'a pensé à lui ôter l'âge et se met à nager âgé. 96
«J'ai levé la tête. Il doit être français». Il n'a pensé à lui ôter l'âge et se met à nager âgé. 107
«J'ai leve la tete. Il doit etre francais». Il n'a pense a lui oter l'age et se met a nager age. 96
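
For repeated use, the two steps can be wrapped into a helper with a precompiled pattern. This is a sketch only; the names AccentStripper and stripAccents are invented for the example. Note that the approach only affects characters that NFD can decompose: a letter such as ø (U+00F8) has no canonical decomposition and stays as it is.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public class AccentStripper {

    // Compiled once; matches the "Combining Diacritical Marks" block (U+0300..U+036F).
    private static final Pattern MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // Decomposes the text with NFD and removes the combining marks.
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return MARKS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // Unicode escapes (U+00E8, U+00E9) keep the source independent of file encoding
        System.out.println(stripAccents("Cr\u00e8me Brul\u00e9e")); // Creme Brulee
        // U+00F8 has no canonical decomposition, so the string is left unchanged
        System.out.println(stripAccents("S\u00f8ren").equals("S\u00f8ren")); // true
    }
}
```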

Remove diacritical marks from a string in JavaScript

An analogous approach in JavaScript, taken from here:

const str = "Crème Brulée"
console.log(str + "=>" + str.normalize('NFD').replace(/[\u0300-\u036f]/g, ""));