29 Jan 2016
As you may be aware, the use of @Converter classes allows for the seamless translation of JSON objects to and from domain objects. However, this approach has a limitation: it rules out the use of JSON functions in JPQL queries, forcing one to rely on native SQL instead (a sketch of that workaround follows the list below). To overcome this limitation, we can implement a custom Hibernate dialect that supports the JSONB type and its functions, thereby exposing them at the JPQL level. The source code for this implementation can be found on GitHub.
Our objective is to achieve three key goals:
- the ability to have custom types backed by JSON available in JPA entities
- the ability to use jsonb_* functions in JPQL queries
- the ability to easily add new types mapped to JSONB columns with minimal coding
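Here is roughly what the native-query workaround looks like (a sketch; the method name is illustrative, and the query assumes the devices table defined later in this post):
public interface DeviceRepository extends CrudRepository<Device, String> {

    // Without JSONB support at the JPQL level, the ->> operator is only
    // reachable through a native query. The method name is illustrative.
    @Query(value = "SELECT * FROM devices d WHERE CAST(d.status ->> 'stateOfCharge' AS float) > 0.1",
           nativeQuery = true)
    List<Device> findHealthyDevicesNative();
}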
By creating a custom Hibernate dialect we can accomplish all three goals and streamline our development process. Let's base the dialect on PostgisDialect, as we plan to use both PostGIS and JSON in our application, and override the registerTypesAndFunctions method:
@Override
protected void registerTypesAndFunctions() {
    super.registerTypesAndFunctions();
    registerColumnType(JSONTypeDescriptor.INSTANCE.getSqlType(), "jsonb");
    registerFunction("extract",
            new StandardSQLFunction("jsonb_extract_path_text", StandardBasicTypes.STRING));
}
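To make Hibernate pick up the new dialect in a Spring Boot application, point the dialect property at it (the dialect class name below is hypothetical; use the name of your own subclass):
spring.jpa.database-platform = com.kartashov.postgis.PostGisWithJsonDialect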
When registering a column type, we tell Hibernate how to handle instances where the target column is of type jsonb. The JSONTypeDescriptor class registers the SQL type name and the converters that translate between the database object and the domain object, known as the “binder” and the “extractor”. The process is relatively simple, as JSONB arrives as a PGobject, which is simply a tagged string.
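The full JSONTypeDescriptor can be found in the GitHub sources; a simplified sketch of what it has to provide is shown below (this assumes the Hibernate 4.x SqlTypeDescriptor SPI; depending on the Hibernate version, additional doBind/doExtract overloads are required):
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;

import org.hibernate.type.descriptor.ValueBinder;
import org.hibernate.type.descriptor.ValueExtractor;
import org.hibernate.type.descriptor.WrapperOptions;
import org.hibernate.type.descriptor.java.JavaTypeDescriptor;
import org.hibernate.type.descriptor.sql.BasicBinder;
import org.hibernate.type.descriptor.sql.BasicExtractor;
import org.hibernate.type.descriptor.sql.SqlTypeDescriptor;
import org.postgresql.util.PGobject;

// A simplified sketch of an SqlTypeDescriptor for jsonb: an SQL type code
// plus a binder and an extractor that move the JSON string through PGobject.
public class JSONTypeDescriptor implements SqlTypeDescriptor {

    public static final JSONTypeDescriptor INSTANCE = new JSONTypeDescriptor();

    @Override
    public int getSqlType() {
        return Types.OTHER; // the PostgreSQL driver reports jsonb as OTHER
    }

    @Override
    public boolean canBeRemapped() {
        return true;
    }

    @Override
    public <X> ValueBinder<X> getBinder(final JavaTypeDescriptor<X> javaTypeDescriptor) {
        return new BasicBinder<X>(javaTypeDescriptor, this) {
            @Override
            protected void doBind(PreparedStatement st, X value, int index,
                                  WrapperOptions options) throws SQLException {
                PGobject holder = new PGobject();
                holder.setType("jsonb");
                holder.setValue(javaTypeDescriptor.unwrap(value, String.class, options));
                st.setObject(index, holder);
            }
        };
    }

    @Override
    public <X> ValueExtractor<X> getExtractor(final JavaTypeDescriptor<X> javaTypeDescriptor) {
        return new BasicExtractor<X>(javaTypeDescriptor, this) {
            @Override
            protected X doExtract(ResultSet rs, String name, WrapperOptions options)
                    throws SQLException {
                Object raw = rs.getObject(name); // comes back as a PGobject
                return raw == null ? null : javaTypeDescriptor.wrap(raw.toString(), options);
            }
        };
    }
}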
By registering a function, we make it available in JPQL under the given name, so extract in our queries ends up as jsonb_extract_path_text in SQL. The StandardSQLFunction class describes how to translate the standard call form f(x, y, ...) into plain SQL. This part is also relatively straightforward.
However, just having a JSON string is only half of the solution. We also need to translate between domain objects and JSON strings. To accomplish this, we use the Jackson library for its powerful data binding capabilities:
import java.io.IOException;

import org.hibernate.type.descriptor.WrapperOptions;
import org.hibernate.type.descriptor.java.AbstractTypeDescriptor;
import org.hibernate.type.descriptor.java.JavaTypeDescriptorRegistry;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JSONBackedTypeDescriptor<T> extends AbstractTypeDescriptor<T> {

    private static final Logger logger = LoggerFactory.getLogger(JSONBackedTypeDescriptor.class);

    private static final ObjectMapper objectMapper = new ObjectMapper();

    public JSONBackedTypeDescriptor(Class<T> type) {
        super(type);
        // Register the descriptor so that Hibernate can look it up by the Java type
        JavaTypeDescriptorRegistry.INSTANCE.addDescriptor(this);
    }

    @Override
    public String toString(T value) {
        try {
            return objectMapper.writeValueAsString(value);
        } catch (JsonProcessingException e) {
            logger.warn("Cannot convert value {} to string", value, e);
            return "{}";
        }
    }

    @Override
    public T fromString(String string) {
        try {
            return objectMapper.readValue(string, getJavaTypeClass());
        } catch (IOException e) {
            logger.warn("Cannot read value from {}", string, e);
            return null;
        }
    }

    @Override
    public <X> X unwrap(T value, Class<X> type, WrapperOptions options) {
        if (value == null) {
            return null;
        }
        if (type.isAssignableFrom(value.getClass())) {
            return type.cast(value);
        }
        if (String.class.isAssignableFrom(type)) {
            return type.cast(toString(value));
        }
        throw unknownUnwrap(type);
    }

    @Override
    public <X> T wrap(X value, WrapperOptions options) {
        if (value == null) {
            return null;
        }
        if (getJavaTypeClass().isInstance(value)) {
            return getJavaTypeClass().cast(value);
        }
        if (value instanceof String) {
            return fromString((String) value);
        }
        throw unknownWrap(value.getClass());
    }
}
To ensure seamless integration, we need to link our two converters together:
Domain Object <-- JavaTypeDescriptor --> String <-- SqlTypeDescriptor --> JSONB
The JavaTypeDescriptor converts between the domain object and its string representation, while the SqlTypeDescriptor moves that string in and out of the JSONB column. An abstract class ties the chain together and ensures a smooth integration:
public abstract class JSONBackedType<T> extends AbstractSingleColumnStandardBasicType<T> {

    public JSONBackedType(JSONBackedTypeDescriptor<T> javaTypeDescriptor) {
        super(JSONTypeDescriptor.INSTANCE, javaTypeDescriptor);
    }

    @Override
    public String[] getRegistrationKeys() {
        return new String[] { getJavaTypeDescriptor().getJavaTypeClass().getCanonicalName() };
    }

    @Override
    public String getName() {
        return getJavaTypeDescriptor().getJavaTypeClass().getName();
    }
}
With all the necessary non-domain-specific code in place, we can now create our entities and repositories. Let's take a look at our Device entity, defined as follows:
@Entity
@Table(name = "devices")
public class Device {

    @Id
    @Column(name = "id")
    private String id;

    @Column(name = "location", nullable = false, columnDefinition = "geometry(Point,4326)")
    private Point location;

    @Column(name = "status", nullable = false, columnDefinition = "jsonb")
    @Type(type = "com.kartashov.postgis.types.StatusType")
    private Status status = new Status();

    ...
}
The Device entity is accompanied by an embedded Status object, which is defined as follows:
public class Status {

    private double stateOfCharge;

    private String lifeCycle;

    ...
}
To link the Java and SQL type descriptors, we define the StatusType class, which extends JSONBackedType:
public class StatusType extends JSONBackedType<Status> {

    private final static JSONBackedTypeDescriptor<Status> descriptor =
            new JSONBackedTypeDescriptor<>(Status.class);

    public StatusType() {
        super(descriptor);
    }
}
With our custom Hibernate dialect in place, we can now seamlessly use JSONB functions in JPQL queries. For instance, we can retrieve all devices with a state of charge greater than 10% with the following query in our DeviceRepository interface:
public interface DeviceRepository extends CrudRepository<Device, String> {

    @Query("SELECT d FROM Device d WHERE CAST(extract(d.status, 'stateOfCharge') AS float) > 0.1")
    List<Device> findHealthyDevices();

    ...
}
This JPQL query will be translated into the following SQL query:
select device0_.id as id1_0_,
       device0_.location as location2_0_,
       device0_.status as status3_0_
from devices device0_
where cast(jsonb_extract_path_text(device0_.status, 'stateOfCharge') as float4) > 0.1
This implementation effectively achieves our goal of integrating JSONB functionality into our JPQL queries, allowing for easy and efficient data retrieval from our devices.
31 Dec 2015
PostGIS is a powerful PostgreSQL extension designed to work with spatial data, including points, lines, and polygons. The easiest way to start experimenting with it is to pull a Docker image with PostGIS already installed. The following commands install Docker and start a PostGIS container:
sudo apt-get install docker.io
sudo docker run --name postgis -e POSTGRES_PASSWORD=postgis -d mdillon/postgis:9.4
Once the container is running, you can access the psql console by executing:
sudo docker exec -it postgis psql -U postgres
At the prompt, you can check the versions of PostgreSQL and PostGIS by running the following commands:
SELECT version();
SELECT postgis_full_version();
As Docker runs the database in a container, you will need the container's IP address if you wish to access PostgreSQL from outside of it. The following command retrieves it:
sudo docker inspect postgis | grep IPAddress
You can also use pgcli, a user-friendly CLI tool, to access the database:
pgcli -h 172.17.0.2 -U postgres
To further explore the capabilities of PostGIS, we can create a simple Spring Boot project. The source code for this project can be found on GitHub. Initialize the application with the following properties:
spring.jpa.database-platform = org.hibernate.spatial.dialect.postgis.PostgisDialect
spring.jpa.show-sql = true
spring.jpa.hibernate.ddl-auto = update
spring.datasource.url = jdbc:postgresql://172.17.0.1:5432/postgres?user=postgres&password=postgis
spring.datasource.driver-class-name = org.postgresql.Driver
In this example, we will use a device entity that concerns itself primarily with the location of the device:
@Entity
@Table(name = "devices")
public class Device {

    @Id
    @Column(name = "id")
    private String id;

    @Column(name = "location", nullable = false, columnDefinition = "geometry(Point,4326)")
    private Point location;

    ...
}
It is essential to define a custom type for the location attribute, so that Hibernate can link Java geometry types with SQL geometry types.
The Hibernate dialect allows us to send queries like the following through JPQL:
public interface DeviceRepository extends CrudRepository<Device, String> {

    @Query("SELECT d FROM Device AS d WHERE within(d.location, :polygon) = TRUE")
    List<Device> findWithinPolygon(@Param("polygon") Polygon polygon);
}
You can take a peek at the class org.hibernate.spatial.dialect.postgis.PostgisDialect to see which functions are available on the JPQL side and how they map to PostGIS functions.
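For example, besides within, the dialect registers predicates such as dwithin, which checks whether two geometries lie within a given distance of each other (expressed in the units of the spatial reference system). A sketch of a radius search along the same lines (the method name is illustrative):
public interface DeviceRepository extends CrudRepository<Device, String> {

    // dwithin(geometry, geometry, distance) maps to ST_DWithin in PostGIS;
    // the method name is illustrative.
    @Query("SELECT d FROM Device AS d WHERE dwithin(d.location, :center, :distance) = TRUE")
    List<Device> findNearby(@Param("center") Point center, @Param("distance") double distance);
}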
Now we can test the search functionality:
Device device = new Device();
device.setId("de-001");
Point point = (Point) new WKTReader().read("POINT(5 5)");
device.setLocation(point);
deviceRepository.save(device);

Polygon polygon = (Polygon) new WKTReader().read("POLYGON((0 0,0 10,10 10,10 0,0 0))");
List<Device> devices = deviceRepository.findWithinPolygon(polygon);
System.out.println(devices);
Since POINT(5 5) lies inside the polygon, the saved device should come back in the result list.
27 Dec 2015
Docker is an incredibly useful tool that allows for easy experimentation with new software. It enables the simultaneous installation, suspension, and deletion of multiple versions of programs, such as Postgres, with minimal effort. Installation and setup are as simple as executing the following commands:
sudo apt-get install docker.io
sudo docker run --name postgres -e POSTGRES_PASSWORD=postgres -d postgres:9.5
Once Docker is installed, you can access an interactive Postgres shell by executing the following command:
sudo docker exec -it postgres psql -U postgres
With this tool, we can now delve into the new JSONB functionality available in Postgres 9.5. We will begin by creating a sensors table and adding some data to it:
CREATE TABLE sensors (
    id integer PRIMARY KEY,
    type character varying(64) NOT NULL,
    config jsonb NOT NULL
);

CREATE INDEX config_gin ON sensors USING gin(config);
INSERT INTO sensors (id, type, config)
VALUES (1, 'soil-moisture', '{ "alpha": 0.543, "beta": -2.312, "enabled": true }'),
(2, 'soil-temperature', '{ "enabled": true, "depth": 0.24 }'),
(3, 'humidity', '{ "enabled": false, "height": 1.34, "device": { "version": "3.4", "supported": true } }');
We can then enable the humidity sensor:
UPDATE sensors
SET config = config || '{ "enabled": true }'::jsonb
WHERE id = 3;
Remove the alpha and beta parameters used for soil moisture calibration:
UPDATE sensors
SET config = config - 'alpha' - 'beta'
WHERE type = 'soil-moisture';
And now let’s remove the supported tag from all device sections:
UPDATE sensors
SET config = config #- '{device,supported}';
Fetch the device version information wherever it’s available:
WITH versioning AS (
SELECT id, type, config #>> '{device,version}' AS version
FROM sensors
)
SELECT *
FROM versioning
WHERE version IS NOT NULL;
Find all the sensors where the depth is specified:
SELECT *
FROM sensors
WHERE config ? 'depth';
Let’s find all properly placed sensors, where either the depth or the height is specified, but not both:
SELECT *
FROM sensors
WHERE config ?| array['depth', 'height']
AND NOT config ?& array['depth', 'height'];
For further reading on the JSON functionality available in Postgres 9.5, I recommend consulting the official documentation.
16 Nov 2015
The following table, referred to as “sensors”, contains sensor data in the form of an id, a location, and a status, where the status field is a JSON object.
+------+------------+-----------------------------------------+
| id   | location   | status                                  |
|------+------------+-----------------------------------------|
| 1    | backyard   | {"healthy": false, "temperature":0.0}   |
| 2    | frontyard  | {"healthy": false, "temperature":12.7}  |
| 3    | unknown    | {"healthy": true, "temperature":28.1}   |
| 4    | farm       | {"healthy": false, "temperature":48.1,  |
|      |            |  "errorCode": 78}                       |
+------+------------+-----------------------------------------+
We can extract a list of all possible status fields with the following query:
SELECT DISTINCT json_object_keys(status) AS status_field
FROM sensors
ORDER BY status_field;
which produces the following result:
+----------------+
| status_field   |
|----------------|
| errorCode      |
| healthy        |
| temperature    |
+----------------+
To determine the highest temperature recorded by any sensor, we can use the following query:
SELECT MAX((status->>'temperature')::float)
FROM sensors;
This query returns:
+-------+
| max |
|-------|
| 48.1 |
+-------+
However, if we wish to determine the highest temperature recorded by healthy sensors only, we can use the following query:
SELECT MAX((status->>'temperature')::float)
FROM sensors
WHERE (status->>'healthy')::bool;
which produces the following result:
+-------+
| max |
|-------|
| 28.1 |
+-------+
Additionally, we can retrieve the average temperature of healthy sensors grouped by location with the following query:
SELECT location,
       AVG((status->>'temperature')::float) AS temperature
FROM sensors
WHERE (status->>'healthy')::bool
GROUP BY location;
which produces the following result:
+------------+---------------+
| location | temperature |
|------------+---------------|
| unknown | 28.1 |
+------------+---------------+
We can further expand this data by using the functions json_each, json_each_text, and so on, to unfold the JSON status object into a larger table. In our next iteration, we use the following SQL query to fetch the packed records:
SELECT id,
       json_each_text(status) AS status_field
FROM sensors;
This query returns:
+------+---------------------+
| id   | status_field        |
|------+---------------------|
| 1    | (healthy,false)     |
| 1    | (temperature,0.0)   |
| 2    | (healthy,false)     |
| 2    | (temperature,12.7)  |
| 3    | (healthy,true)      |
| 3    | (temperature,28.1)  |
| 4    | (healthy,false)     |
| 4    | (temperature,48.1)  |
| 4    | (errorCode,78)      |
+------+---------------------+
We will then unfold the status_field record through the following query:
SELECT id,
       (status).key,
       (status).value
FROM (SELECT id,
             json_each(status) AS status
      FROM sensors) AS statuses;
This will yield the following table:
+------+-------------+---------+
| id   | key         | value   |
|------+-------------+---------|
| 1    | healthy     | false   |
| 1    | temperature | 0.0     |
| 2    | healthy     | false   |
| 2    | temperature | 12.7    |
| 3    | healthy     | true    |
| 3    | temperature | 28.1    |
| 4    | healthy     | false   |
| 4    | temperature | 48.1    |
| 4    | errorCode   | 78      |
+------+-------------+---------+
Note the types of the key and value columns. If we run the same unfolding with json_each_text, both come back as plain text:
SELECT DISTINCT pg_typeof((status).key) AS key_type,
                pg_typeof((status).value) AS value_type
FROM (SELECT id,
             json_each_text(status) AS status
      FROM sensors) AS statuses;
which gives us:
+------------+--------------+
| key_type | value_type |
|------------+--------------|
| text | text |
+------------+--------------+
Had we used json_each in the query above, the value_type would be json rather than text. This is significant because it preserves the type information (as limited as it may be in JSON) instead of casting everything to text.
Additionally, the pgcli tool proved to be extremely helpful while experimenting with Postgres. Further information regarding composite types can be found in the Postgres documentation: http://www.postgresql.org/docs/9.4/static/rowtypes.html
12 Nov 2015
As database developers, we sometimes come across the challenge of storing data that doesn’t have a fixed schema, such as user preferences or dynamic telemetry readings. When the set of keys is constantly changing or unknown upfront, the data cannot be normalized. A common solution is to “transpose” the table, creating a key-value companion table with the following kind of schema:
| sensor_id | location |
+-----------+-----------------+
| 1 | unknown |
...
| status_field | status_value | sensor_id |
+--------------+--------------+-----------+
| temperature | "0.9" | 1 |
| healthy | "false" | 1 |
...
However, this approach has its downsides: it forces unnecessary joins, restricts all values to a single type (usually a string or a blob), and adds a layer of indirection. With the rise of NoSQL databases, the focus has shifted towards more flexible data structures as a solution to these issues.
A more sensible approach is to store the status data in the same table as the sensors, using a structured format such as XML or JSON. For example, in PostgreSQL:
| sensor_id | sensor_location | sensor_status |
+-----------+-----------------+-------------------------+
| 1 | unknown | { "temperature" : 0.9, |
| | | "healthy" : false } |
...
This approach allows for structured data in the database and makes it possible to use SQL queries to quickly find all unhealthy sensors, for example:
SELECT sensor_id
FROM sensors
WHERE (sensor_status->>'healthy')::boolean = false;
Another great aspect of this solution is that it extends into the application logic through converters that handle the data conversion. Using the Jackson library, for example, you can create a Status class with annotations that allow it to work seamlessly with the JSON data stored in the database:
public class Status {

    private final Map<String, Object> properties = new HashMap<>();

    @JsonAnyGetter
    public Map<String, Object> properties() {
        return properties;
    }

    @JsonAnySetter
    public void set(String key, Object value) {
        properties.put(key, value);
    }

    public Object get(String key) {
        return properties.get(key);
    }
}
And a converter class can translate between the database data and the domain Status object, taking care of the unpleasant serialization and deserialization work:
@Converter
public class StatusConverter implements AttributeConverter<Status, String> {

    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public String convertToDatabaseColumn(Status attribute) {
        try {
            return mapper.writeValueAsString(attribute);
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Cannot serialize", e);
        }
    }

    @Override
    public Status convertToEntityAttribute(String dbData) {
        if (dbData == null) {
            return null;
        }
        try {
            return mapper.readValue(dbData, Status.class);
        } catch (IOException e) {
            throw new RuntimeException("Cannot deserialize", e);
        }
    }
}
By using this approach, we get a flexible, dynamic and maintainable data structure in a well-defined format, we can run complex queries against it in plain SQL, and we can work with the same structure in the application logic. The entity ties everything together:
@Entity
@Table(name = "sensors")
public class Sensor {

    @Id
    @GeneratedValue
    @Column(name = "sensor_id")
    private Long id;

    @Column(name = "sensor_location")
    private String location;

    @Column(name = "sensor_status")
    @Convert(converter = StatusConverter.class)
    private Status status;

    ...
}
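With the converter in place, reading and writing the status is transparent to the application code. A quick usage sketch (the setters on Sensor and the sensorRepository bean are assumed to follow the usual Spring Data conventions; they are not shown above):
// The setters and the repository are assumed here; a standard
// CrudRepository<Sensor, Long> is enough.
Sensor sensor = new Sensor();
sensor.setLocation("backyard");

Status status = new Status();
status.set("healthy", true);
status.set("temperature", 23.5);
sensor.setStatus(status);

sensorRepository.save(sensor);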
One tiny note: you will need to use the pgjdbc-ng driver for this; at least, it is the only one that worked for me.
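For reference, a minimal Spring Boot datasource configuration for pgjdbc-ng might look like the following (a sketch; note the jdbc:pgsql URL scheme used by this driver, and adjust the host and credentials to your setup):
spring.datasource.driver-class-name = com.impossibl.postgres.jdbc.PGDriver
spring.datasource.url = jdbc:pgsql://172.17.0.2:5432/postgres?user=postgres&password=postgres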