Elasticsearch Cloud + Java REST Client Fuzzy Search

Finally!

The new Elasticsearch Java REST client is here.

Let’s get right into it!

The Elasticsearch Java REST client doesn’t seem to lend itself to dot notation as much as I’d prefer.

Examples of dot notation:

Object.method

or

Object.field

I had to construct much of the search request using a handmade JSON string. I made use of the Apache HttpEntity to pass the constructed JSON to Elasticsearch. Since I took this approach, let’s look at how to first create the pure JSON query.

In the screenshot below, we have the Kibana search screen on the left, and on the right is the response. We perform a search by first and last name. Because we are using the fuzziness option, we don’t have to spell the names correctly. Below, we left out a letter in the first name, and yet we are still able to find a matching record, which you can see on the right side of the image.

kibana_wordpress_screenshot
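To see why a misspelled first name can still match, here is a minimal Java sketch of the idea behind fuzziness. It is not Elasticsearch's implementation: it uses plain Levenshtein edit distance, and the class and method names (FuzzyDemo, editDistance, autoFuzziness) are made up for illustration. The thresholds in autoFuzziness follow the documented defaults for "AUTO" (terms of 1-2 characters must match exactly, 3-5 characters allow one edit, longer terms allow two).

```java
// Sketch only: plain Levenshtein distance, to illustrate why "San" can match "Sean".
// FuzzyDemo and its methods are hypothetical names, not part of any Elasticsearch API.
final class FuzzyDemo {

    // Classic dynamic-programming edit distance (insert, delete, substitute).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Number of edits allowed for a query term under "fuzziness": "AUTO" (default thresholds).
    static int autoFuzziness(String term) {
        int n = term.length();
        if (n <= 2) {
            return 0;
        }
        return n <= 5 ? 1 : 2;
    }

    public static void main(String[] args) {
        String typed = "San";
        String stored = "Sean";
        System.out.println("edits needed:  " + editDistance(typed, stored));
        System.out.println("edits allowed: " + autoFuzziness(typed));
    }
}
```

"San" is one insertion away from "Sean", and a three-letter term is allowed one edit, so the record is still found.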

Now this next part is not difficult; it’s just a bit annoying. We need to construct this request string in Java. Our goal is a query to find a person named “Sean Wu”. Notice how we butcher the first name passed in and are still able to locate the record, thanks to fuzzy matching!

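import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

import org.apache.http.Header;
import org.apache.http.HttpEntity;
import org.apache.http.HttpHeaders;
import org.apache.http.HttpHost;
import org.apache.http.message.BasicHeader;
import org.apache.http.nio.entity.NStringEntity;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

// Basic-auth credentials, Base64-encoded for the Authorization header
String encodedBytes = Base64.getEncoder().encodeToString("username:password".getBytes(StandardCharsets.UTF_8));

Header[] headers = {
    new BasicHeader(HttpHeaders.CONTENT_TYPE, "application/json"),
    new BasicHeader("Authorization", "Basic " + encodedBytes) };

RestClient r = RestClient
    .builder(new HttpHost("url_to_elastic_cloud_endpoint", 9243, "https"))
    .setDefaultHeaders(headers)
    .build();

// Search terms; the first name is deliberately misspelled ("San" instead of "Sean")
String firstname = "San";
String lastname = "Wu";

// Handmade JSON: a bool/must query with one fuzzy match clause per field
HttpEntity entity = new NStringEntity(
    "{" +
    "  \"query\": {" +
    "    \"bool\": {" +
    "      \"must\": [" +
    "        { \"match\": { \"firstname\": { \"query\": \"" + firstname + "\", \"fuzziness\": \"AUTO\" } } }," +
    "        { \"match\": { \"lastname\": { \"query\": \"" + lastname + "\", \"fuzziness\": \"AUTO\" } } }" +
    "      ]" +
    "    }" +
    "  }" +
    "}");

// Ask Elasticsearch to pretty-print the response
Map<String, String> paramMap = new HashMap<String, String>();
paramMap.put("pretty", "true");

// Run the search against all indices
Response response = r.performRequest("GET", "/*/_search", paramMap, entity);

// Print the response body and some request details
System.out.println(EntityUtils.toString(response.getEntity()));
System.out.println("Host -" + response.getHost());
System.out.println("RequestLine -" + response.getRequestLine());
System.out.println(response.toString());

// Release the client's resources
try {
    r.close();
} catch (IOException e) {
    e.printStackTrace();
}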

}

There you have it!
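If the hand-concatenated string gets too annoying, one option is to wrap it in a couple of small helpers. The sketch below is only an illustration: FuzzyQueryBuilder, escape, fuzzyMatch and boolMust are hypothetical names, not part of the REST client, and the escaping covers only backslashes and double quotes (not control characters).

```java
// Sketch only: helper methods that assemble the same bool/must fuzzy query as a string.
// All names here are made up for illustration; none belong to the Elasticsearch client.
final class FuzzyQueryBuilder {

    // Escape backslashes and double quotes so a value cannot break out of its JSON string.
    // Assumption: input contains no control characters (those are not handled here).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // One fuzzy match clause: {"match":{"<field>":{"query":"<value>","fuzziness":"AUTO"}}}
    static String fuzzyMatch(String field, String value) {
        return "{\"match\":{\"" + escape(field) + "\":{\"query\":\"" + escape(value)
                + "\",\"fuzziness\":\"AUTO\"}}}";
    }

    // Wrap any number of clauses in {"query":{"bool":{"must":[...]}}}
    static String boolMust(String... clauses) {
        return "{\"query\":{\"bool\":{\"must\":[" + String.join(",", clauses) + "]}}}";
    }

    public static void main(String[] args) {
        String body = boolMust(fuzzyMatch("firstname", "San"), fuzzyMatch("lastname", "Wu"));
        System.out.println(body);
    }
}
```

The resulting string can be handed to NStringEntity exactly like the concatenated version above.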


Web Architecture: Setting Up For Success

Once upon a time there was a non-profit organization that wanted to make great improvements to their website. They wanted their website to become more dynamic, interactive and provide tailored content!

Let’s call this non-profit organization ABC! Company ABC wanted to create an account home section for their members when they log in to the website. This landing page would be called MyABC and would include items like:

  • review & edit profile
  • print member certificate

As you can read, these are basic necessities for a home account. Let’s push this account home further and provide a better user experience for our members.

New Idea! ADMIN users will help to improve the MyABC user experience. We will allow ADMIN users to tag the pages in the website so we can have categorized content. This way content can be tailored for each user. For example, user A specializes in Cancer Research. ADMIN users would tag pages with cancer related content using keywords like “cancer” and “cancer research”. We can then place links to these pages for said user, providing a custom experience.
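As a rough sketch of that idea in Java: each page carries a set of tags, each member has a set of interests, and MyABC shows the pages whose tags overlap those interests. Everything here (the TaggedPages and Page classes, the pagesFor method, the page titles) is hypothetical and only meant to illustrate the matching step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: in-memory tag matching. Class names, titles and tags are illustrative.
final class TaggedPages {

    static final class Page {
        final String title;
        final Set<String> tags;

        Page(String title, String... tags) {
            this.title = title;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    // Return the titles of pages that share at least one tag with the member's interests.
    static List<String> pagesFor(Set<String> interests, List<Page> pages) {
        List<String> titles = new ArrayList<>();
        for (Page page : pages) {
            for (String tag : page.tags) {
                if (interests.contains(tag)) {
                    titles.add(page.title);
                    break; // one matching tag is enough; avoid adding the page twice
                }
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(
                new Page("Advances in Cancer Research", "cancer", "cancer research"),
                new Page("Member Certificate Help", "membership"));
        Set<String> interests = new HashSet<>(Arrays.asList("cancer research"));
        System.out.println(pagesFor(interests, pages));
    }
}
```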

Even better, by tagging and categorizing these pages, we can then feed that category information into our search engine. In my case, I am using Elasticsearch. We can steer users to specific content or just make suggestions.

For example, here is some sample JSON data about a page on the ABC website:

{
    "title": "Testing Suggestions",
    "description": "Use this to test...",
    "tags": [
        "test",
        "testing"
    ]
}

So let’s take a quick inventory so far:

  • JSON data with categories
  • Store JSON data into Elasticsearch
  • Use categories for more precise search

Here’s a screenshot of what we can end up with:

Blog-WebSuccess-searchBox

We type the category words we used when storing the data (“test”, “testing”). Using jQuery autocomplete, we make an AJAX call to Elasticsearch and return the titles of the document(s) matching the keywords!

This figure is a summary of how the data has been stored:

Blog-WebSuccess-tags

How does this help us in the long run? We get the capability to do the following:

  • Know traffic volume, since we can restrict users to pages available in our search engine, matching to specific keywords!!!
  • Categorized content can appear in the user’s MyABC (account home) page.
  • Send these clicks and page visits to Google Analytics for further analysis.

This was a very high-level write-up. We will cover more details in later posts. The point is to show how categorizing content, whether early or late in the website building process, can earn big gains on the SEO side of things!

Elasticsearch, cut back on your JavaScript

One of my favorite things about Elasticsearch is how it fits into the technology ecosystem. Elasticsearch has proven very helpful not only in delivering a great search, but also in alleviating some of the work of building a new site or web app.

Take a JavaScript based search for example. My shop uses AngularJS, and we love it. I just wish our front end developer didn’t have to tediously handle data categories for each new set of data we get. That means copy-paste, parsing strings, worrying about commas, escaping characters.

This is done for every piece of data that is different and new.

With Elasticsearch, getting a search up and running can be fast and efficient. Instead of parsing strings and escaping characters, we get right into actually categorizing the data.

Say I have some shoe products I’d like to create a search for. The first thing is to get my data into JSON format; Elasticsearch likes to be fed JSON.

About that JSON: it has categories. Elasticsearch likes categories so much that it has its own name for them. They’re called “Types”.

So we have shoes:

ID | Product_Name | Type    | Price
1  | Moon Walkers | Boot    | $50
2  | Air Jordans  | Running | $100

 

Here is the first row from the above table in JSON:

{
    "ID": "1",
    "Product_Name": "Moon Walkers",
    "Type": "Boot",
    "Price": "$50.00"
}

 

Now that the data is stored in Elasticsearch, let’s dig a little deeper as to how Elasticsearch can speed up search creation.

Elasticsearch mostly deals with Types when concerned with data. Instead of a heavy amount of JavaScript, you can send an AJAX request to Elasticsearch for particular type(s). Then, in JavaScript, your only concern would be filtering the returned text values using a text box!

Voila!

Ok, I could show a bit more on how to actually do this!

In Elasticsearch, commands are JSON strings. Say we wanted to get all shoes with the type “Boot”. We’d do something like this:

GET /_search
{
    "query": {
        "type": {
            "value": "Boot"
        }
    }
}

 

Imagine a list of checkboxes of shoe types: “Boot”, “Running”, etc. When a user clicks a checkbox, we make an AJAX call to Elasticsearch, getting data by type. The returned data is then filtered with a JavaScript text box, in our case built with AngularJS.

 

checkboxes

 

So, your JavaScript and Elasticsearch can work in tandem.

  • Elasticsearch gets your data
  • JavaScript filters the data by text

Rolling out a new search this way is much faster than pure JavaScript on the searching and filtering.
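To make the division of labor concrete, here is a small in-memory sketch, written in Java only for illustration. An in-memory list stands in for Elasticsearch: byType plays the role of the type query, and filterByText plays the role of the text box filter that would really run in the browser. The class and method names are hypothetical. Also note that mapping types and the type query belong to older Elasticsearch releases and were removed in later ones, so on a newer cluster the first step would likely be a query on an ordinary field instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Sketch only: an in-memory stand-in for "fetch by type, then filter by text".
// ShoeSearchSketch, Shoe, byType and filterByText are illustrative names.
final class ShoeSearchSketch {

    static final class Shoe {
        final String name;
        final String type;

        Shoe(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Step 1 (Elasticsearch's job in the post): keep only shoes of the requested type.
    static List<Shoe> byType(List<Shoe> all, String type) {
        List<Shoe> matches = new ArrayList<>();
        for (Shoe shoe : all) {
            if (shoe.type.equals(type)) {
                matches.add(shoe);
            }
        }
        return matches;
    }

    // Step 2 (the text box's job): case-insensitive substring filter on the product name.
    static List<String> filterByText(List<Shoe> shoes, String typed) {
        String needle = typed.toLowerCase(Locale.ROOT);
        List<String> names = new ArrayList<>();
        for (Shoe shoe : shoes) {
            if (shoe.name.toLowerCase(Locale.ROOT).contains(needle)) {
                names.add(shoe.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Shoe> shoes = Arrays.asList(
                new Shoe("Moon Walkers", "Boot"),
                new Shoe("Air Jordans", "Running"));
        // The user ticks the "Boot" checkbox, then types "moon" in the text box.
        System.out.println(filterByText(byType(shoes, "Boot"), "moon"));
    }
}
```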

Java on Linux with multiple JDKs (Old, 2010)

Recently one of our Linux servers at work crashed and died. I decided this time to skip the packages that come with the system and just come up with a more flexible way of compiling and running Java code.

May as well have multiple versions of Java. At the time I was coding in 1.5, so I needed a JDK 5. But as I moved into the later testing phases of an application, I wanted to try some of the application monitoring features in Java 6. So, to cover multiple bases: multiple JDKs.

First, download the JDKs you need (I assume Java 5 and Java 6 for all of us here), the whole kit and caboodle!

Since this is all about options, I like to:
1. Install the different versions of Java in /opt at the root level. You should be able to do this by running ./java-file-you-downloaded.bin

2. Write a shell script that sets the Java version and location at run time. If you are using the bash shell, just add export JAVA_HOME=/opt/jdkversion, plus any other PATH and CLASSPATH exports you need.

Using bash shell:

#! /bin/bash

export JAVA_HOME=/opt/jdk1.5.0_18
export PATH=$JAVA_HOME/bin:$PATH

3. Add to the shell script a call to your Java class. Pass the fully qualified class name, without the .class extension:
java -classpath .:./moreDir:./anotherDir package.JavaClass

Now you can compile and run your code with different versions of Java just by creating one shell script per version!
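To confirm which JDK a given script actually picked up, a tiny class like the following can be compiled and run from that script. This is just a sketch; the class name WhichJava is made up, while java.version and java.home are standard system properties.

```java
// Sketch only: prints the version and install location of the JVM that is running it.
final class WhichJava {

    // Build a one-line description from two standard system properties.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the script exported JAVA_HOME=/opt/jdk1.5.0_18 and put its bin directory first on the PATH, the printed java.home should point somewhere under that directory.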

The Beauty of Java Hashtable (Old, 2010)

Sometimes newbies can’t quite grasp how to use Java to get data from a database in an easy way.
A simple hashtable goes a long way in web development. For instance:

String sql = "select name, age, dob from employee";

// Notice the String key and Object value, since data from the database comes in many forms: decimal, varchar, boolean
Hashtable<String, Object> employees = new Hashtable<String, Object>();

int count = 0;

// statement is an already-created java.sql.Statement
ResultSet rs = statement.executeQuery(sql);
while (rs.next()) {
    // One entry per column, keyed by column name plus row number
    employees.put("name" + count, rs.getString("name"));
    employees.put("age" + count, rs.getString("age"));
    employees.put("dob" + count, rs.getString("dob"));
    count++;
}
This simple routine will give the novice developer a new weapon for his or her arsenal.
Here we have a generic way to store database data in a single lookup table.
If the data was all varchar, which maps to a Java String, we could have created a hashtable with a signature of <String,String>. However, <String,Object> gives the flexibility we need.

Don’t forget to increment that counter!
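Here is a self-contained sketch of the same pattern that runs without a database, including how to read the values back. The Object[][] of sample rows stands in for the ResultSet, and the class name, method name and sample data are all made up for illustration.

```java
import java.util.Hashtable;

// Sketch only: the "column name + row counter" key scheme, with an array standing in
// for the ResultSet. EmployeeTableDemo, load and the sample rows are illustrative.
final class EmployeeTableDemo {

    // Each row is {name, age, dob}; values keep their own types thanks to the Object value.
    static Hashtable<String, Object> load(Object[][] rows) {
        Hashtable<String, Object> employees = new Hashtable<>();
        int count = 0;
        for (Object[] row : rows) {
            employees.put("name" + count, row[0]);
            employees.put("age" + count, row[1]);
            employees.put("dob" + count, row[2]);
            count++; // without this, every row would overwrite row 0
        }
        return employees;
    }

    public static void main(String[] args) {
        Object[][] rows = {
            {"Ann", 41, "1969-02-01"},
            {"Bob", 35, "1975-07-15"},
        };
        Hashtable<String, Object> employees = load(rows);
        // Walk the rows back out by rebuilding the keys from the counter.
        for (int i = 0; employees.containsKey("name" + i); i++) {
            System.out.println(employees.get("name" + i) + ", "
                    + employees.get("age" + i) + ", "
                    + employees.get("dob" + i));
        }
    }
}
```

One caveat with this approach: Hashtable rejects null values, so a NULL column coming back from the database would need to be handled before calling put.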

Catch The Cloud (Old, 2010)

Cloud Computing is slowly establishing itself as the backbone of large corporations, startup companies and independents alike. The timeline of online technology, as it descended into the dotcom bomb and arose from the ashes like a phoenix, is a classic story of nothing to something.

In the beginning there was the Internet, with new websites popping up daily. Can you recall all the early ‘Social Media’ sites, like ubo.net? It had everything I wanted but at the same time nothing. There were links to things I knew people like me were interested in: local events, restaurants, music, etc. The problem with these companies was that there was no true value in their web presence. Think about having a staff of information technology specialists, a marketing department and the whole nine yards, but no true stream of revenue. That is what many companies faced in the mid to late 90s. No bottom line; hence, the dotcom bomb destroyed these companies.

Fast-forward to 2010, and businesses have a better understanding of technology and how it does more than act as a showpiece. In the financial services industry we have Cloud Computing serving as virtual work staff. Salesforce.com, one of the largest Customer Relationship Management (CRM) platforms, if not the largest, has grown to over a billion dollars in assets in less than 10 years, reaching into all facets of business, like lead management for E-Trade Financial and global collaboration with business partners for Dell Computers. Cloud-based services allow smaller entities to live a better existence in their industry. The cost of cloud-based services is dwarfed by the cost of keeping an in-house staff of information technology professionals. This is golden news for startups and independents.

ElasticSearch – Taking The NoSQL Plunge

I cannot tell a lie!

I have fallen in love with ElasticSearch. The ELK (ElasticSearch, Logstash & Kibana) Stack to be exact. Why?

UI: As a developer spending a lot of time on the back end and the middle tier, I don’t think much about how we’re going to get that sexy user interface up and running. In my case, the team is small and we have one UI developer, so time is of the essence. Out of the box, Kibana provides a fantastic user interface. The UI is mostly point and click. Think Google Analytics with statistical functions built into the UI for you to manipulate.

NoSQL: Never been interested in a game of darts, but ElasticSearch is the ultimate “dart board”. I can throw any record at it, with any fields, without having to worry about matching a table structure. Sure, we could write apps that alter database tables, but this is A.D., not B.C.!

Data Injection: I’m a sucker for CSV files. Simple and convenient. But an SQL statement to get the same data is even better. Logstash allows you to inject your data into ElasticSearch with ease, via different methods, like the two aforementioned.