The problem that I face is how to combine encapsulation and optimal memory use.
Question
The problem that I face is how to combine encapsulation with optimal memory use.
I can't show you my code, so I will explain the problem with a (hopefully) sufficiently detailed example.
Let's say we need a database of men. We want to know only two things about each of them:
Age of the man (in hours from birth).
Name of the town he lives in.
The convenient and natural way to manage this data is to create an object that corresponds to a man and store these objects in an array:
class OMan1 {
public:
    OMan1( const int &age, const astring &t ): fAge(age), fTown(t) {}
    const int& age() const { return fAge; }
    const astring& Town() const { return fTown; }
    // assumes astring supports operator+ with an int
    astring FullId() const { return fTown+fAge; }
private:
    int fAge;
    astring fTown;
};
OMan1 mans[N]; // illustrative; a real array would also need a default constructor
Here our OMan objects are self-contained and everything feels nice.
Except for the fact that we copy the town names thousands of times, wasting memory and execution time.
One improvement is to keep a separate array of town names and, for each OMan, store only the age, the id of the town, and a pointer to the towns array:
class OMan2 {
    // same functionality as for OMan1
    int fAge;
    int fTownId;
    astring* fTowns; // points to the shared array of town names
};
The object is still self-contained, and since sizeof(int) + sizeof(void*) is much less than sizeof(astring), we win a lot. But the object is still a factor of 2-3 larger than sizeof(fAge), and we repeat the fTowns pointer billions of times.
Memory optimisation is crucial for me, so what I do is keep only fAge and fTownId and move functionality such as Town() and FullId() out of the OMan class into a class like OManDataBase:
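To make the size argument concrete, here is a rough sketch that compares the three layouts discussed in this question. It assumes std::string as a stand-in for astring, and the names Layout1, Layout2, Layout3 and bytesFor are hypothetical, introduced only for this illustration. The exact numbers depend on the platform and the standard library.

```cpp
#include <cstddef>
#include <string>

// Assumption: std::string stands in for the astring type of the question.
using astring = std::string;

// Layout 1: every record owns a full copy of its town name.
struct Layout1 { int fAge; astring fTown; };

// Layout 2: a town id plus a pointer back to the shared towns array.
struct Layout2 { int fAge; int fTownId; const astring* fTowns; };

// Layout 3: two integers only; the towns array lives elsewhere.
struct Layout3 { int fAge; int fTownId; };

// Bytes needed for n records of type T, ignoring heap memory owned by strings.
template <class T>
constexpr std::size_t bytesFor(std::size_t n) { return n * sizeof(T); }

// Holds on common platforms: two ints need no padding between them.
static_assert(sizeof(Layout3) == 2 * sizeof(int), "two ints, no padding expected");
// The back pointer is paid once per record.
static_assert(sizeof(Layout2) > sizeof(Layout3), "pointer adds per-record cost");
```

Calling bytesFor<Layout2>(n) and bytesFor<Layout3>(n) for the expected number of records gives a quick estimate of what the per-record pointer costs in total.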
class OMan3 {
public:
    OMan3( const int &age, const int &tid ): fAge(age), fTownId(tid) {}
    const int& age() const { return fAge; }
    const int& TownId() const { return fTownId; }
    // const astring& Town() const { return fTown; }
    // astring FullId() const { return fTown+fAge; }
private:
    int fAge;
    int fTownId;
};
class OManDataBase {
public:
    // constructor, destructor
    const int& age( const int& i) const { return fMans[i].age(); }
    const astring& Town( const int& i) const { return fTowns[fMans[i].TownId()]; }
    // returned by value: the result is a temporary
    astring FullId( const int& i) const { return Town(i)+age(i); }
private:
    vector<OMan3> fMans;
    vector<astring> fTowns;
};
Now OMan3 is not a self-contained object. It doesn't know its full id, for instance. That means that if I need to process the data of one man, I have to pass the whole OManDataBase instance:
OBillType47 NewBillType47( const OManDataBase &db, int i ) { ... }
instead of
OBillType47 NewBillType47( const OMan &m ) { ... }
Encapsulation is broken here and code readability clearly suffers. (I wrote Type47 to emphasise that I can have many functions that work with OMan objects, and I can't put all of them into the OManDataBase class.)
I wonder whether there are other ways to solve the data duplication problem while keeping objects as self-contained as possible.
Explanation / Answer
Here's some naive advice I hope you find helpful to conserve memory...
Solution 1
Instead of using a C++ class with a string member to store the data, use "packed" structures (check your compiler's documentation for how to declare one; it is usually done with an attribute or pragma, and sometimes a compiler flag). Then limit the memory footprint of each field: for instance, a uint8_t is big enough to store a man's age in years, although an age in hours from birth, as in your example, needs a wider type such as uint32_t. Once you've done this, store the packed structures in a C-style array (for the least overhead) or in a std::vector<OManStruct> if you want to make your life easier. Instead of storing the name of the town as a string in the struct, create a std::map<uint32_t, string> that maps town ids to town names, and store only the townId in the struct.
Your struct definition might look something like this (if you're using GCC):
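// needs #include <cstdint> (or <stdint.h> in C)
struct __attribute__((__packed__)) OManStruct
{
    uint8_t fAge;     // enough for years; hours from birth would need uint32_t
    uint32_t fTownId; // maybe you could use uint16_t here but you might be cutting it close
};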
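As a rough sketch of how the pieces of Solution 1 could fit together, the block below checks the size effect of packing and builds a small town table. PlainMan, PackedMan and TownTable are hypothetical names used only for this illustration, and the attribute syntax assumes GCC or Clang. Note one swapped-in detail: instead of a std::map<uint32_t, string>, the sketch keeps the names in a std::vector indexed by id, plus a reverse lookup so that repeated names map to the same id.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// The same two fields, without and with packing (GCC/Clang attribute syntax).
struct PlainMan { std::uint8_t fAge; std::uint32_t fTownId; };
struct __attribute__((__packed__)) PackedMan { std::uint8_t fAge; std::uint32_t fTownId; };

static_assert(sizeof(PackedMan) == 5, "1 + 4 bytes, no padding");
static_assert(sizeof(PlainMan) > sizeof(PackedMan), "unpacked version is padded");

// Hypothetical helper: stores each distinct town name exactly once.
class TownTable {
public:
    // Returns the id of name, adding the name the first time it is seen.
    std::uint32_t intern(const std::string& name) {
        auto it = fIds.find(name);
        if (it != fIds.end()) return it->second;
        const auto id = static_cast<std::uint32_t>(fNames.size());
        fNames.push_back(name);
        fIds.emplace(name, id);
        return id;
    }
    // Looks up the name that belongs to an id handed out by intern().
    const std::string& name(std::uint32_t id) const {
        assert(id < fNames.size());
        return fNames[id];
    }
    std::size_t size() const { return fNames.size(); }
private:
    std::vector<std::string> fNames;                      // id -> name
    std::unordered_map<std::string, std::uint32_t> fIds;  // name -> id
};
```

Read packed fields by value (for example `auto id = m.fTownId;`); taking a reference or pointer to a misaligned packed member is something compilers may warn about or reject.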
Architecturally, I would wrap the knowledge of these data structures in a class that maintains the array and its associated data (number of elements etc.) and exposes some nice getters and setters, so the outside world doesn't know about this implementation detail. Or, better yet, create an OManFactory that can be treated as a singleton in your project and returns OMan objects on request.
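One way to read "returns OMan objects on request" is a short-lived view object: a pointer to the owning store plus an index, created only when a caller asks for a record. The sketch below is one possible shape under that assumption, not a definitive design; ManStore, ManView and describe are hypothetical names, and std::string stands in for astring.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

class ManStore;  // owns the compact records and the shared town names

// Short-lived handle: a store pointer plus an index. It is created on request
// and never stored per record, so each record still costs only two integers.
class ManView {
public:
    ManView(const ManStore* store, std::size_t index) : fStore(store), fIndex(index) {}
    std::uint32_t age() const;
    const std::string& town() const;
    std::string fullId() const;
private:
    const ManStore* fStore;
    std::size_t fIndex;
};

class ManStore {
public:
    // Adds a town name and returns its id.
    std::uint32_t addTown(std::string name) {
        fTowns.push_back(std::move(name));
        return static_cast<std::uint32_t>(fTowns.size() - 1);
    }
    // Adds a record and returns its index.
    std::size_t addMan(std::uint32_t age, std::uint32_t townId) {
        assert(townId < fTowns.size());
        fMans.push_back(Record{age, townId});
        return fMans.size() - 1;
    }
    // Hands out a view; it stays valid only while the store is alive.
    ManView at(std::size_t i) const {
        assert(i < fMans.size());
        return ManView(this, i);
    }
private:
    friend class ManView;
    struct Record { std::uint32_t fAge; std::uint32_t fTownId; };
    std::vector<Record> fMans;
    std::vector<std::string> fTowns;
};

inline std::uint32_t ManView::age() const { return fStore->fMans[fIndex].fAge; }
inline const std::string& ManView::town() const {
    return fStore->fTowns[fStore->fMans[fIndex].fTownId];
}
inline std::string ManView::fullId() const {
    return town() + "/" + std::to_string(age());
}

// Client code takes a view instead of the whole store plus an index.
inline std::string describe(const ManView& m) { return m.fullId(); }
```

With this shape, a function like NewBillType47 from the question could take a `const ManView&` and would not need to know about the store. The trade-off is that a view must not outlive the store it points into.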
Solution 2
You might want to seriously consider putting your data in a database such as SQLite or Postgres, or in a NoSQL store like Redis. If the amount of data is going to grow over time, there's no guarantee that you won't blow through all your RAM even if you implement Solution 1. A database would also let you store the data persistently and, if it is an SQL database, give you a nice mechanism for running queries on the data.