Process home monitoring data using the Time Series Database in Bluemix

I keep a lot of information about my house – I have had sensors and recording units in various parts of my house for years, recording info through a variety of different devices.

Over the years I’ve built a number of different solutions for storing and displaying the information, so when the opportunity came up to write about a database built specifically for recording this kind of data, I jumped at the chance. This is what I came up with:

As home automation increases, so does the number of sensors recording the statistics and information that feed it. Using the Time Series Database in Bluemix makes it easy to record time-logged data and to query and report on it. In this tutorial, we’ll examine how to create, store, and, ultimately, report on information by using the Time Series Database. We’ll also use the database to correlate data points across multiple sensors to track the effectiveness of heating systems in a multi-zone house.

You can read the full article here


Posted in bluemix, ibmdeveloperworks, webdevelopment, webservices | Tagged | Comments Off

Office 365 Activation Won’t Accept Password

So today I signed up for Office 365, since it seemed to be the easiest way to get hold of Office; although I already have a license and subscription, I also have more machines to cover.

To say I was frustrated when I tried to activate Office 365 was an understatement. Each time I went through the process, it would reject the password saying there was a problem with my account.

I could log in with my email and password online, but through the activation, no dice. Some internet searches, including the ludicrously bad Windows support search, didn’t elicit anything useful.

Then it hit me. Office 2011 for Mac through an Office 365 subscription probably doesn’t know about secondary authentication.

Sure enough, I created an application-specific password, logged in with that, and yay, I now have a running Office 365 subscription.

If you are experiencing the same problem, using an application-specific password might just help you out.


Posted in Coalface, microsoft, office | Tagged | Comments Off

Hadoop BoF Session at OSCON

I have a BoF session at OSCON next week:

Migrating Data from MySQL and Oracle into Hadoop

The session is at 7pm Tuesday night – look for rooms D135 and/or D137/138.

Correction: We are now in E144 on Tuesday, with the Hadoop get-together first at 7pm and the data migration session to follow at 8pm.

I’m actually going to be joined by Gwen Shapira from Cloudera, who has a BoF session on Hadoop next door at the same time, along with Eric Herman from Booking.com. We’ll use the opportunity to talk all things Hadoop, but particularly the ingestion of data from MySQL and other databases into the Hadoop datastore.

As always, it’d be great to meet anybody interested in Hadoop at the BoF, please come along and introduce yourselves, and hopefully I’ll see you next week!


Posted in big data, cloudera, continuent, hadoop, oscon, oscon2014, Presentations and Conferences | Tagged | Comments Off

Making Real-Time Analytics a Reality — TDWI – The Data Warehousing Institute

My article on how to make the real-time processing of information from traditional transactional stores into Hadoop a reality has been published over at TDWI:

Making Real-Time Analytics a Reality — TDWI – The Data Warehousing Institute.


Posted in analytics, big data, data migration, hadoop, oracle | Tagged , , | Comments Off

Replicating Oracle Webinar Question Follow-up

We had a really great webinar on replicating to/from Oracle earlier this month, and you can view the recording of that webinar here.

A good sign of a great webinar is the number of questions that come afterwards, and we didn’t get through them all, so here are all the questions and answers for the entire webinar.

Q: What is the overhead of Replicator on source database with asynchronous CDC?

A: With asynchronous operation there is no substantial CPU overhead (as there is with synchronous operation), but the volume of generated redo logs grows, requiring more disk space and better log management to ensure that the space is used effectively.

Q: Do you support migration from Solaris/Oracle to Linux/Oracle?

A: The replication is not certified for use on Solaris, however, it is possible to configure a replicator to operate remotely and extract from a remote Oracle instance. This is achieved by installing Tungsten Replicator on Linux and then extracting from the remote Oracle instance.

Q: Are there issues in supporting tables without Primary Keys on Oracle to Oracle replication?

A: Non-primary key tables will work, but it is not recommended for production as it implies significant overhead when applying to a target database.

Q: On Oracle->Oracle replication, if there are triggers on source tables, how is this handled?

A: Tungsten Replicator does not automatically disable triggers. The best solution is to remove triggers on slaves, or rewrite triggers to identify whether a trigger is being executed on the master or slave and skip it accordingly, although this requires rewriting the triggers in question.

Q: How is your offering different/better than Oracle Streams replication?

A: We like to think of ourselves as GoldenGate without the price tag. The main difference is the way we extract the information from Oracle, otherwise, the products offer similar functionality. For Tungsten Replicator in particular, one advantage is the open and flexible nature, since Tungsten Replicator is open source, released under a GPL V2 license, and available at https://code.google.com/p/tungsten-replicator/.

Q: How is the integrity of the replica maintained/verified?

A: Replicator has built-in real-time consistency checks: if an UPDATE or DELETE doesn’t update any rows, Replicator will go OFFLINE:ERROR, as this indicates an inconsistent dataset.
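The principle behind this check is generic: a row-based UPDATE or DELETE replayed on a replica should always match the row it matched on the master. A minimal sketch of the idea, using SQLite as a stand-in target database (the table and the `apply_replicated` helper are illustrative, not Tungsten internals):

```python
import sqlite3

# Replica database with one row, assumed in sync with the master.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")

def apply_replicated(conn, sql, params):
    cur = conn.execute(sql, params)
    if cur.rowcount == 0:
        # An UPDATE/DELETE matching zero rows means the replica has
        # diverged; Tungsten would go OFFLINE:ERROR here, we just raise.
        raise RuntimeError("consistency violation: statement matched no rows")
    conn.commit()

apply_replicated(conn, "UPDATE accounts SET balance = ? WHERE id = ?", (90, 1))
try:
    # Row 999 never existed on this replica: the dataset is inconsistent.
    apply_replicated(conn, "DELETE FROM accounts WHERE id = ?", (999,))
except RuntimeError as e:
    print(e)  # consistency violation: statement matched no rows
```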

Q: Can configuration file based passwords be specified using some form of encrypted value for security purposes to keep them out of the clear?

A: We support an INI file format so that you do not have to use the command-line installation process. There is currently no supported option for an encrypted version of these values, but the INI file can be secured so it is only readable by the Tungsten user.
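Securing the INI file comes down to ordinary file permissions. A minimal sketch (the file name is a placeholder, not a Tungsten default; on a real install you would also `chown` the file to the Tungsten user):

```python
import os
import stat

# Hypothetical location for the INI file; adjust for your installation.
path = "tungsten.ini"

# Create the file if needed, then restrict it to owner read/write (0600)
# so only the user running the replicator can read the passwords inside.
with open(path, "a"):
    pass
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

print(oct(os.stat(path).st_mode & 0o777))  # 0o600
```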

Q: Our source DB is Oracle RAC with ~10 instances. Is coherency maintained in the replication from activity in the various instances?

A: We do not monitor the information that has been replicated; but CDC replicates row-based data, not statements, so typical sequence insertion issues that might occur with statement based replication should not apply.

Q: Is there any maintenance of Oracle sequence values between Oracle and replicas?

A: Sequence values are recorded into the row data as extracted by Tungsten Replicator. Because the inserted values, not the sequence itself, are replicated, there is no need to maintain sequences between hosts.

Q: How timely is the replication? Particularly for hot source tables receiving millions of rows per day?

A: CDC is based on extracting the data at an interval, but the interval can be configured. In practice, assuming there are regular inserts and updates on the Oracle side, the data is replicated in real-time. See https://docs.continuent.com/tungsten-replicator-3.0/deployment-oracle-cdctuning.html for more information on how this figure can be tuned.
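Conceptually, interval-based CDC extraction is just a polling loop over the change tables. A schematic sketch (the function names and parameters here are illustrative, not Tungsten configuration options):

```python
import time

def extract_loop(fetch_changes, apply_changes, poll_interval=1.0, max_polls=3):
    """Poll for captured changes at a fixed interval and apply any found.

    With frequent writes on the source, each poll finds fresh rows, so
    replication is effectively real-time; the interval bounds the lag.
    """
    for _ in range(max_polls):
        changes = fetch_changes()
        if changes:
            apply_changes(changes)
        time.sleep(poll_interval)

# Simulated change batches arriving over three polls (one poll is empty).
applied = []
batches = iter([[("INSERT", 1)], [], [("UPDATE", 1)]])
extract_loop(lambda: next(batches, []), applied.extend, poll_interval=0.01)
print(applied)  # [('INSERT', 1), ('UPDATE', 1)]
```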

Q: Can parallel extractor instances be spread across servers rather than through threads on the same server (which would be constrained by network or HBA)?

A: Yes. We can install multiple replicators and tune the extraction of the parallel extractor accordingly. That configuration would need to be manual, but it is certainly possible.

Q: Do you need the CSV file (to select individual tables with the setupCDC.sh configuration) on the master setup if you want all tables?

A: No.

Q: If you lose your slave down the road, do you need to re-provision from the initial SCN number or is there a way to start from a later point?

A: This is the reason for the THL Sequence Number introduced in the extractor. If you lose your slave, you can install a new slave and have it start at the transaction number where the failed slave stopped if you know it, since the information will be in the THL. If not, you can usually determine this by examining the THL directly. There should be no need to re-provision – just to restart from the transaction in the THL on the master.

Q: Regarding a failed slave, what if it failed such that we don’t have a backup, or we wanted to provision a second slave that had no initial data?

A: If you had no backups or data, yes, you would need to re-provision with the parallel extractor in order to seed the target database.

Q: Would you do that with the original SCN? If it had been a month or two, is there a way to start at a more recent SCN, or do you have to re-run the setupCDC process?

A: The best case is to have two MySQL slaves and, when one fails, to re-provision it from the healthy one. This avoids the setupCDC stage.

However, the replication can always be started from a specific event (SCN) provided that SCN is available in the Oracle undo log space.

Q: How does Tungsten handle Oracle’s CLOB and BLOB data types?

A: Providing you are using asynchronous CDC, these types are supported; for synchronous CDC, these types are not supported by Oracle.

Q: Can different schemas in Oracle be replicated at different times?

A: Each schema is extracted by a separate service in Replicator, so they are independent.

Q: What is the size limit for BLOB or CLOB column data types?

A: This depends on the CDC capabilities in Oracle, and is not limited within Tungsten Replicator. You may want to refer to the Oracle Docs for more information on CDC: http://docs.oracle.com/cd/B28359_01/server.111/b28313/cdc.htm

Q: Would different versions of Oracle, e.g. Enterprise Edition and Standard Edition One, be considered heterogeneous environments?

A: Essentially yes, although the nomenclature is really only a categorization; it does not affect the operation, deployment, or functionality of the replicator. All these features are part of the open source product.

Q: Can a 10g database (master) send the data to a 11g database (slave) for use in an upgrade?

A: Yes.

Q: Does the Oracle replicator require the Oracle database to be in archive mode?

A: Yes. This is a requirement for Oracle’s CDC implementation.

Q: How will we be able to revisit this recorded webinar?

A: Slides and a recording from today’s webinar will be available at http://www.slideshare.net/Continuent_Tungsten



Posted in continuent, oracle, replication | Tagged , , | Comments Off

A New Home for Tungsten in the UK

I was suitably heartened to hear about the new mine opening up in Devon here in the UK to mine the element tungsten.

I mentioned this to my associates at Continuent, where Csaba picked out the most appropriate quotes from the article:

“Tungsten is an extraordinary metal.”

“It’s almost as hard as a diamond and has one of the highest melting points of any mineral.”

“Adding a small amount to steel makes it far harder, far more resistant to stress and heat. The benefits to industry are obvious.”

Leading him to suggest: “Adding a small amount of Tungsten to MySQL makes it far harder, far more resistant to stress and failures. The benefits to industry are obvious.”

I couldn’t possibly agree more!



Posted in continuent | Tagged , | Comments Off

Continuent at Hadoop Summit

I’m pleased to say that Continuent will be at the Hadoop Summit in San Jose next week (3-5 June). Sadly I will not be attending as I’m taking an exam next week, but my colleagues Robert Hodges, Eero Teerikorpi and Petri Versunen will be there to answer any questions you have about Continuent products, and, of course, Hadoop replication support built into Tungsten Replicator 3.0.

If you are at the conference, please go along and say hi to the team. And, as always, if there are any questions please let them or me know.


Posted in big data, continuent, hadoop, oracle, Presentations and Conferences | Tagged | Comments Off

Real-Time Data Movement: The Key to Enabling Live Analytics With Hadoop

An article about moving data into Hadoop in real-time has just been published over at DBTA, written by me and my CEO Robert Hodges.

In the article I talk about one of the major issues for everyone deploying databases in the modern heterogeneous world – how do we move and migrate data effectively between entirely different database systems in a way that is efficient and usable? How do you get the data you need into the database you need it in? If your source is a transactional database, how does that data get moved into Hadoop in a way that makes it usable for querying through Hive, Impala, or HBase?

You can read the full article here: Real-Time Data Movement: The Key to Enabling Live Analytics With Hadoop



Posted in big data, hadoop, oracle | Tagged , , | Comments Off

Cross your Fingers for Tech14, see you at OSCON


So I’ve submitted my talks for the Tech14 UK Oracle User Group conference which is in Liverpool this year. I’m not going to give away the topics, but you can imagine they are going to be about data translation and movement and how to get your various databases talking together.

I can also say, after having seen other submissions for talks this year (as I’m helping to judge), that the conference is shaping up to be very interesting. There’s a good spread of different topics this year, but I know from having talked to the organisers that they are looking for more submissions in the areas of Operating Systems, Engineered Systems and Development (mobile and cloud).

If you’ve got a paper, presentation, or idea for one that you think would be useful, please go ahead and submit your idea.

I’m also pleased to say that I’ll be at OSCON in Oregon in July, handling a Birds of a Feather (BOF) session on the topic of exchanging data between MySQL, Oracle and Hadoop. I’ll be there with my good friend Eric Herman from Booking.com where we’ll be providing advice, guidance, experiences, and hoping to exchange more ideas, wishes and requirements for heterogeneous environments.

It’d be great to meet you if you want to come along to either conference.



Posted in big data, conferences, continuent, hadoop, oracle, Presentations and Conferences, ukoug | Tagged , | Comments Off

Passion for Newspaper Comics? Watch Stripped

I’m a big fan of comics – and although I am a fan of Spiderman, Superman, and my personal favourite, 2000AD – what I’m really talking about is the newspaper comics featuring stars like Garfield, Dilbert, and Calvin and Hobbes.

Unfortunately, being in the UK before the internet existed in its current form, finding these comics, particularly the US ones, was difficult. We don’t have many US comics in UK newspapers, and, to be honest, very few papers in the UK have a good variety of any comics. That made feeding the habit difficult, as I would literally trawl around bookstores in the humour section to find the books I needed.

Garfield was my first foray into the market, and I bought one of the first books not long after it came out. Then, as I started looking around a little more, I came across others, like Luann and For Better or For Worse, before finding the absolute joy that was Calvin and Hobbes, and ultimately getting hold of Foxtrot, Sherman’s Lagoon, and many, many more.

Of course, the Internet has made these hugely accessible, and indeed not only do I read many comics every day, but I very deliberately subscribe (and by that, I mean pay money) to both Comics Kingdom (43 daily comics in my subscription) and GoComics.com (72 daily comics). I also continue to buy the books. Why?

Because at the end of a day of looking at screens and taxing the brain, what I really want to do is chill and read some still intelligent, but not mentally taxing, content, and that means reading my comic books. They give me a break and a giggle, and I find that a nice way to go to sleep.

The more important reason, though, is because I enjoy these comics and believe these people should be rewarded for their efforts. Honestly, these guys work their laughter muscles harder than most people I know, creating new jokes, every day, that make me laugh. They don’t just do this regularly, or even frequently. They do it *every day*.

As a writer I know how hard it is to create new content every day, and keep it interesting. I cannot imagine how hard it is to keep doing it, and making it funny and enjoyable for people to read.

Over the years, I’ve also bought a variety of interesting things, including the massive Dilbert, Calvin & Hobbes and Far Side collectibles. I own complete collections of all the books for my favourite authors, and I’ve even contacted the authors directly when I haven’t been able to get them from the mighty Amazon. To people like Hilary B Price (Rhymes with Orange), Tony Carillo (F-Minus), Scott Hilburn (The Argyle Sweater), Leigh Rubin (Rubes) and Dave Blazek (Loose Parts), I thank you for your help in feeding my addiction. To Mark Leiknes (the now defunct Cow & Boy), I thank you for the drawings from your drawing board and notebook, and I’m sorry it didn’t work out.

But to Dave Kellett & Fred Schroeder I owe a debt of special gratitude. Of course Dave Kellett writes the excellent Sheldon, and not only do I have the full set, Dave signed them first. I’ve also got one of the limited editions Arthur’s…

But together, they produced the wonderful Stripped! which I funded through Kickstarter along with so many others (you can even see my name in the credits!). If you have any interest in how comics are drawn, where the ideas come from, and how difficult the whole process is, you should watch it. Even more, you should watch it if you want to know what these people look like.

Comic artists are people whose names, in many cases, we don’t even know; some names we might recognise, but very few of these people do we ever get to see. Yet they are superstars. Really. Think about it: they write the screenplay, direct it, produce it, provide all the special effects, act all the parts, and do all the voices. And despite wearing all of these different hats, every day, they can still be funny and, like all good comedy, thought-provoking.

For me there is one poignant moment in the film, too: understanding how, in a world where newspaper and comic syndication is dwindling fast, these people expect to make a living. The Internet is a great way for comic artists to get exposure to an ever-growing army of fans, but I think there’s going to be an interesting crossover period for those comics that started out in the papers.

The film itself is great. Not only do you get to see these comic artist gods, but you get to understand their passion and interest, and why they do what they do. That goes a long way towards helping you empathise with them and their passion, which lines up nicely with yours: reading them.

If you like comics, find a way of giving some money back to these people, whether it’s a subscription, buying their books or buying merchandise.



Posted in Coalface, comics, passion, stripped | Comments Off