[Fiware-lab-federation-nodes] [CESNET #134122] Re: experiences with HA

Cristian Cristelotti cristian.cristelotti.coll at trentinonetwork.it
Tue Nov 17 13:57:49 CET 2015


I had understood it was Kilo with HA, but after this morning's call I'm not so sure about the HA part.

Cristian

----- Original message -----
From: "Sean Murphy" <murp at zhaw.ch>
To: "Federico Michele Facca" <federico.facca at create-net.org>
Cc: "Cristian Cristelotti" <cristian.cristelotti.coll at trentinonetwork.it>, fiware-lab-federation-nodes at lists.fiware.org
Sent: Tuesday, 17 November 2015 13:17:50
Subject: Re: [Fiware-lab-federation-nodes] [CESNET #134122] Re: experiences with HA


Hi again all, 


To follow up on this after the discussion on the confcall this morning (which
I found very useful; it might be good to have more discussion of these
important issues on the calls from time to time).


The status of the Spanish node was not clear to me: I did not concretely
understand what Fernando said regarding HA. From previous communication,
I understand that they chose not to use HA in Juno; in the minutes of
today's meeting, I see


"Migrated to Kilo, pending swift migration (waiting help from IBM)" 



@Fernando - can you tell us if you went with HA in Kilo? 


BR, 
Seán. 




On Mon, Nov 16, 2015 at 9:27 AM, Murphy Seán (murp) < murp at zhaw.ch > wrote: 




Hi Fede, all, 

Juno HA is quite stable in our experience. The problems are always related to Neutron when you restart a


Good to hear. 





node. So rule number one: if you need to restart, use Corosync to take your node out of the cluster. This will do a graceful re-balancing among the L3 agents. In case of sudden "death" of the node, the problem is not so much the failure itself as the moment you re-attach the node. In this case too, correct management of Corosync is the trick.


Thanks for the pointers - I may ask for more info on the confcall as I don't fully 
get the point here. Also, it would be good to know if this also applies to Kilo. 





In case you have not noticed, following the new DoW in FI-CORE and the Open Call, the requirements on SLA and availability are quite strict. If your node dies because the only controller you have is unrecoverable, and you breach the required availability threshold as a result, this may have financial implications for FI-CORE nodes.


Thanks for pointing that out. I guess everyone has a strong interest in having the 
systems as reliable as possible - unreliable systems give lots of headaches. I guess 
what I was interested in knowing is whether HA is likely to make the system more 
reliable or less reliable: the experience in XiFi was that it seemed to make things 
less reliable. 


BR, 
Seán. 

Br, 


Federico 


-- 
Future Internet is closer than you think! 
http://www.fiware.org 

Official Mirantis partner for OpenStack Training 
https://www.create-net.org/community/openstack-training 

-- 
Dr. Federico M. Facca 

CREATE-NET 
Via alla Cascata 56/D 
38123 Povo Trento (Italy) 

P +39 0461 312471 
M +39 334 6049758 
E federico.facca at create-net.org 
T @chicco785 
W www.create-net.org 



On Fri, Nov 13, 2015 at 11:54 AM, Theofanis Katsiaounis < th_katsiaounis at neuropublic.gr > wrote: 



Hi all, 
Indeed Kilo could solve the network issues since networking is HA 
capable too. 
Containers/Swift can be a problem especially since you have to leave 
space to create the storage rings etc. 

Regards, 
Fanis 

On 13/11/2015 12:50 PM, Cristian Cristelotti wrote:


> Hi Sean, 
> 
> Our experience with Grizzly (HA) was very bad. Icehouse (HA) was better but not stable. Now we are on Juno on a single node and we haven't faced any problems.
> We are working on the migration to Kilo (HA + Murano + Ceilometer).
>
> Kilo seems to have solved the problems mentioned by Fanis.
> If you don't deploy the node with HA, you won't have the containers functionality; or rather, you will have to install Swift manually after the Fuel deployment.
> 
> 
> 
> Regards 
> 
> Cristian 
> 
> ----- Original message -----
> From: "Sean Murphy" <murp@ zhaw.ch >
> To: "Theofanis Katsiaounis" <th_katsiaounis@ neuropublic.gr >
> Cc: fiware-lab-federation-nodes@ lists.fiware.org
> Sent: Friday, 13 November 2015 11:40:13
> Subject: Re: [Fiware-lab-federation-nodes] [CESNET #134122] Re: experiences with HA
> 
> 
> 
> Hi all, 
> 
> 


> So the feedback so far is the following: 
> - Riwal says that running Juno/HA is not so problematic, but has not had a specific failure 
> situation where HA could really be tested 
> - Fernando notes that Juno/HA exhibited stability problems for larger numbers of users and 
> decided against it 
> - Fanis notes that Icehouse/HA was quite problematic in multiple respects 
> 
> 
> From our point of view, this does not paint a very positive picture of HA, and despite
> our inclination to experiment with newer technologies we would probably opt not to
> use HA.
> 
> 
> Does anyone in the project have Kilo/HA experience? 
> 
> 
> BR, 
> Seán. 
> 
> 
> 
> 
> 
> 
> On Fri, Nov 13, 2015 at 10:38 AM, Theofanis Katsiaounis < th_katsiaounis@ neuropublic.gr > wrote: 
> 
> 
> 
> Hi all, 
> We had HA on Icehouse and it was a mess, especially the Networking/Neutron part. Namespaces were not transferred between nodes, so if one went down the VMs lost networking. Reboots were a lottery indeed: sometimes they worked, sometimes they did not. And when we lost power once, I had to rebuild the node.
> Of course the FIWARE Lab handbook asks for an HA solution, but I see in the case of Spain this has already been violated ;).
> My two cents is that the guys from Spain made the right choice. I do not think HA in OpenStack is ready for production, especially with a large number of users.
> 
> Regards, 
> Fanis 
> 
> 
> On 13/11/2015 11:33 AM, Riwal KERHERVE wrote:
> 
> 
> 
> 
> 
> Sean, 
> 
> 
> 
> In Grizzly, any time we needed to restart processes handled by CRM, it was a lottery. Sometimes everything went fine, and sometimes the processes kept on restarting and it took us hours to put things back in order.
> 
> In Juno, we have never experienced this kind of behavior. Whenever we needed to restart processes through CRM, everything went fine.
> 
> 
> 
> To answer your question:
>
> The only time we played with HA was to apply some modifications to our configuration files. I do not recall exercising the HA capabilities, such as having to take one node down and switch all processes to the other node.
> 
> 
> 
> BR 
> 
> Riwal 
> 
> 
> 
> From: sean@ gopaddy.ch [ mailto: sean@ gopaddy.ch ] On behalf of Sean Murphy
> Sent: Thursday, 12 November 2015 17:01
> To: Riwal KERHERVE
> Cc: fiware-lab-federation-nodes@ lists.fiware.org
> Subject: Re: [CESNET #134122] Re: [Fiware-lab-federation-nodes] experiences with HA
> 
> 
> 
> 
> 
> 
> Hi Riwal, 
> 
> 
> 
> 
> 
> Good feedback - thanks for that. 
> 
> 
> 
> 
> 
> As a matter of interest, have you ever needed to exercise any of the HA
> capabilities or have you tested it in anger?
> 
> 
> 
> 
> 
> BR, 
> 
> 
> Seán. 
> 
> 
> 
> 
> 
> On Thu, Nov 12, 2015 at 4:51 PM, Riwal KERHERVE via RT < xifi-support@ rt.cesnet.cz > wrote: 
> 
> Sean, 
> 
> I do not have experience with Kilo in HA, but our node is on Juno with HA. We installed it with Fuel 6.0 (2 controllers and 1 arbitrator). We have never had any trouble so far: it is very stable, nothing like HA in Grizzly.
> 
> BR 
> Riwal 
> 
> From: fiware-lab-federation-nodes-bounces@ lists.fiware.org [mailto: fiware-lab-federation-nodes-bounces@ lists.fiware.org ] On behalf of Sean Murphy
> Sent: Thursday, 12 November 2015 16:33
> To: fiware-lab-federation-nodes@ lists.fiware.org
> Subject: [Fiware-lab-federation-nodes] experiences with HA
> 
> 
> 
> 
> Hi all, 
> 
> We're looking at our upgrade strategy and we're curious to 
> hear any experience with Kilo HA both from the deployment 
> perspective as well as the operations perspective. 
> 
> From XiFi, I remember Fanis reporting a split-brain scenario
> with HA and in the end he opted not to go with a HA solution; 
> this gives me pause for thought when considering this 
> deployment solution, even though it seems to be the 
> preferred solution. 
> 
> Generally, we would be well disposed to a HA deployment 
> as we would like to learn about it, but we do not want to 
> end up deploying a technology that is too far from production 
> readiness. 
> 
> Does anyone have any experience that they can share on this 
> point? 
> 
> BR, 
> Seán. 
> 
> 
> 
> 
> 
> _______________________________________________ 
> Fiware-lab-federation-nodes mailing list Fiware-lab-federation-nodes@ lists.fiware.org https://lists.fiware.org/listinfo/fiware-lab-federation-nodes 
> 
> 
> Disclaimer
> 
> 
> 







-- 
Cristian Cristelotti

Collaborator, Trentino Network Srl





